String matching and maximum occurrence count

I have the long string below, and a text file containing about 1000 lines like it. I want to count how often each date occurs in that text file. Any ideas on how I can do that?

{"interaction":{"author":{"id":"53914918","link":"http:\/\/twitter.com\/53914918","name":"ITTIA","username":"s8c"},"content":"RT @fubarista: After thousands of years of wars I am not an optimist about peace. The US economy is totally reliant on war. It is the on ...","created_at":"Sun, 10 Jul 2011 08:22:16 +0100","id":"1e0aac556a44a400e07497f48f024000","link":"http:\/\/twitter.com\/s8c\/statuses\/89957594197803008","schema":{"version":2},"source":"oauth:258901","type":"twitter","tags":["attretail"]},"language":{"confidence":100,"tag":"en"},"salience":{"content":{"sentiment":4}},"twitter":{"created_at":"Sun, 10 Jul 2011 08:22:16 +0100","id":"89957594197803008","mentions":["fubarista"],"source":"oauth:258901","text":"RT @fubarista: After thousands of years of wars I am not an optimist about peace. The US economy is totally reliant on war. It is the on ...","user":{"created_at":"Mon, 05 Jan 2009 14:01:11 +0000","geo_enabled":false,"id":53914918,"id_str":"53914918","lang":"en","location":"Mouth of the abyss","name":"ITTIA","screen_name":"s8c","time_zone":"London","url":"https:\/\/thepiratebay.se"}}}

Answer:

Use a RandomAccessFile or a BufferedReader to read the data in portions; you can then count the frequency of each date with plain string parsing...
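The line-by-line approach can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes each line of the file holds one JSON record, that the first "created_at" field on a line is the one of interest, and that its value always looks like "Sun, 10 Jul 2011 08:22:16 +0100". The class and method names (LineDateCounter, extractDay, countDays) are made up for this example.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

class LineDateCounter {
    // Assumed field prefix; taken from the sample record in the question.
    private static final String KEY = "\"created_at\":\"";

    // Returns the day part ("10 Jul 2011") of the first created_at value
    // on the line, or null if the line has no such field.
    static String extractDay(String line) {
        int start = line.indexOf(KEY);
        if (start < 0) {
            return null;
        }
        start += KEY.length();
        int end = line.indexOf('"', start);
        if (end < 0) {
            return null;
        }
        // value looks like "Sun, 10 Jul 2011 08:22:16 +0100"
        String[] parts = line.substring(start, end).split(" ");
        if (parts.length < 4) {
            return null;
        }
        return parts[1] + " " + parts[2] + " " + parts[3];
    }

    // Reads the input one line at a time and counts how often each day occurs.
    static Map<String, Integer> countDays(BufferedReader reader) {
        Map<String, Integer> counts = new TreeMap<>();
        reader.lines().forEach(line -> {
            String day = extractDay(line);
            if (day != null) {
                counts.merge(day, 1, Integer::sum);
            }
        });
        return counts;
    }

    public static void main(String[] args) {
        String sample =
            "{\"created_at\":\"Sun, 10 Jul 2011 08:22:16 +0100\"}\n"
          + "{\"created_at\":\"Sun, 10 Jul 2011 09:15:00 +0100\"}\n"
          + "{\"created_at\":\"Mon, 11 Jul 2011 07:00:00 +0100\"}\n";
        System.out.println(countDays(new BufferedReader(new StringReader(sample))));
    }
}
```

For a real file, the StringReader could be swapped for a FileReader on the file path; only one line is held in memory at a time, so the file size does not matter much.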

Answer:

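Every date follows a stable pattern, something like \d\d (Jan|Feb|...) 20\d\d. So you can use a regular expression (the Pattern class in Java) to extract these dates, and then use a HashMap to increment the value for each key, where the key is the date that was found. Sorry, no code, but I hope this helps :)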
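A rough sketch of the regex-plus-HashMap idea described above. The class name RegexDateCounter and the method countDates are invented for illustration, and the pattern assumes two-digit days and years of the form 20xx, as in the answer. Note that it counts every date in the text, including the user's account creation date inside each record, so the input may need to be narrowed first.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexDateCounter {
    // Two-digit day, month abbreviation, year 20xx (e.g. "10 Jul 2011").
    private static final Pattern DATE = Pattern.compile(
        "\\d{2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) 20\\d{2}");

    // Counts how many times each matched date string occurs in the text.
    static Map<String, Integer> countDates(String text) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            // merge() inserts 1 for a new key, otherwise adds 1 to the old count
            counts.merge(m.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "\"created_at\":\"Sun, 10 Jul 2011 08:22:16 +0100\" "
                    + "\"created_at\":\"Sun, 10 Jul 2011 08:22:16 +0100\" "
                    + "\"created_at\":\"Mon, 05 Jan 2009 14:01:11 +0000\"";
        for (Map.Entry<String, Integer> e : countDates(text).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```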

Answer:

It is a JSON string, so you should parse it rather than pattern-match against it. See this example HERE.

Answer:

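Copy the required string into test.txt and place it on the C: drive. The code below works; I used the Pattern and Matcher classes.

This is the pattern for the dates you asked about; you can check the pattern here:

"(Sun|Mon|Tue|Wed|Thu|Fri|Sat)[,] \d\d (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d\d\d\d"

Check the code: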

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test {
    public static void main(String[] args) throws Exception {
        // Read the whole file into memory; StringBuilder avoids the
        // quadratic cost of repeated String concatenation.
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new FileReader("c:\\test.txt"))) {
            int c;
            while ((c = br.read()) != -1) {
                sb.append((char) c);
            }
        }
        String s = sb.toString();
        System.out.println(s);

        // Day name, comma, two-digit day, month abbreviation, four-digit year.
        Pattern p = Pattern.compile(
            "(Sun|Mon|Tue|Wed|Thu|Fri|Sat), \\d\\d (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \\d\\d\\d\\d");
        Matcher m = p.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
            System.out.println("Match number " + count);
            System.out.println(m.group()); // the matched date text
        }
    }
}

Very good descriptions can be found at Link 1 and Link 2.

Answer:

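Your input string is in JSON format, so I suggest using a JSON parser, which makes the parsing much easier and, more importantly, more robust! Getting started with JSON parsing may take a few minutes, but it is worth it.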

After that, parse the "created_at" tag. Create a map with your dates as keys and their counts as values, and write something like:

import java.util.HashMap;
import java.util.Map;

int estimatedSize = 500; // initial capacity hint, to avoid some HashMap resizing
Map<String, Integer> myMap = new HashMap<>(estimatedSize);
String[] dates = {}; // here comes your parsed data, draw it into the loop later

for (String nextDate : dates) {
    Integer oldCount = myMap.get(nextDate);
    if (oldCount == null) { // not in yet
        myMap.put(nextDate, 1);
    } else { // already in
        myMap.put(nextDate, oldCount + 1);
    }
}
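Once the "created_at" values have been pulled out, the counting and the "maximum occurrence" step might look like the sketch below. It assumes the values follow the RFC 1123 style seen in the sample ("Sun, 10 Jul 2011 08:22:16 +0100"), which the JDK's built-in DateTimeFormatter.RFC_1123_DATE_TIME can parse, and it groups by the calendar day in the timestamp's own offset. The class MostFrequentDay and its methods are hypothetical names for this example.

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MostFrequentDay {
    // Parses each timestamp and counts occurrences per calendar day.
    // Throws DateTimeParseException if a value is not in the assumed format.
    static Map<LocalDate, Integer> countByDay(List<String> createdAtValues) {
        Map<LocalDate, Integer> counts = new TreeMap<>();
        for (String value : createdAtValues) {
            LocalDate day = OffsetDateTime
                .parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toLocalDate();
            counts.merge(day, 1, Integer::sum);
        }
        return counts;
    }

    // Returns the entry with the highest count, or null for an empty map.
    // On a tie, the earliest day wins because the TreeMap iterates in date order.
    static Map.Entry<LocalDate, Integer> mostFrequent(Map<LocalDate, Integer> counts) {
        Map.Entry<LocalDate, Integer> best = null;
        for (Map.Entry<LocalDate, Integer> e : counts.entrySet()) {
            if (best == null || e.getValue() > best.getValue()) {
                best = e;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(
            "Sun, 10 Jul 2011 08:22:16 +0100",
            "Sun, 10 Jul 2011 21:05:00 +0100",
            "Mon, 11 Jul 2011 09:00:00 +0100");
        Map<LocalDate, Integer> counts = countByDay(values);
        System.out.println(counts);
        System.out.println("Most frequent: " + mostFrequent(counts));
    }
}
```

Parsing into LocalDate rather than keeping the raw string means two timestamps on the same day with different times end up under the same key, which is what a per-date frequency needs.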

