7-46 Sina Weibo Hot Topics (30 points)

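Sina Weibo lets users embed a "topic" in a post: writing the topic text between a pair of "#" characters generates a topic link, and clicking the link shows how many people are discussing the same or similar topics. Sina Weibo also continuously updates its list of hot topics and places the hottest ones in a prominent position to recommend them to users.

This problem asks you to implement a simplified hot-topic recommendation feature: parse the topics out of a large number of English microblog posts (English because Chinese word segmentation is troublesome) and find the topic mentioned by the most posts.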

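Input Format:

The input first gives a positive integer $N$ ($\le 10^5$), followed by $N$ lines, each containing one English microblog post of at most 140 characters. Any content enclosed between a pair of nearest # characters is considered a topic. The input guarantees that # characters appear in pairs.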
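The pairing rule can be sketched as a small toggle: the first # opens a topic and the next # closes it. The function name `extractRawTopics` is hypothetical, and the sketch relies on the guarantee that # characters come in pairs.

```cpp
#include <string>
#include <vector>

// Returns the raw text found between each consecutive pair of '#' characters.
// Assumes, as the input guarantees, that '#' characters come in pairs.
std::vector<std::string> extractRawTopics(const std::string& post) {
    std::vector<std::string> topics;
    std::string current;
    bool inside = false;  // true while between an opening and a closing '#'
    for (char c : post) {
        if (c == '#') {
            if (inside) {  // closing '#': the topic is complete
                topics.push_back(current);
                current.clear();
            }
            inside = !inside;
        } else if (inside) {
            current += c;
        }
    }
    return topics;
}
```

Called on the third sample post, `This is a #Hot# #Hot# topic`, this yields the raw topic `Hot` twice. Counting it only once for that post is a separate step.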

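Output Format:

On the first line, print the topic mentioned by the most posts. On the second line, print the number of posts that mention it. If that topic is not unique, print the alphabetically smallest one and print And k more ... on the third line, where k is the number of other hot topics. The input guarantees that at least one topic exists.

Note: two topics are considered the same if, after removing all symbols other than English letters and digits and ignoring case, they are the same string and they also have exactly the same word segmentation. In the output, capitalize the first letter, keep only lowercase English letters and digits otherwise, and separate the words of the original text with a single space.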
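One way to apply the tie-breaking rule is to keep the topics in an ordered map, so the first topic that reaches the maximum count is also the alphabetically smallest. The sketch below assumes the keys are already normalized and that each topic maps to the set of post indices mentioning it. `HotTopic` and `pickHottest` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical result type: the winning topic, the number of posts that
// mention it, and how many other topics share that count (the "k").
struct HotTopic {
    std::string name;
    std::size_t posts = 0;
    int ties = 0;
};

HotTopic pickHottest(const std::map<std::string, std::set<int>>& postsByTopic) {
    HotTopic best;
    int tied = 0;  // topics seen so far whose count equals the current maximum
    // std::map iterates keys in ascending order, so the first topic to reach
    // the maximum is the alphabetically smallest one.
    for (const auto& entry : postsByTopic) {
        std::size_t n = entry.second.size();
        if (n > best.posts) {
            best.name = entry.first;
            best.posts = n;
            tied = 1;
        } else if (n == best.posts && n > 0) {
            tied++;
        }
    }
    best.ties = tied > 0 ? tied - 1 : 0;
    return best;
}
```

The third output line would be printed only when `ties` is greater than zero.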
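The comparison rule can be sketched as a normalization step that maps every spelling of a topic to one key. The sketch assumes that any run of non-alphanumeric characters acts as a single word separator, which is one reading of "same word segmentation". `normalizeTopic` and `displayTopic` are hypothetical names.

```cpp
#include <cctype>
#include <string>

// Lowercases letters, keeps digits, and turns every run of other characters
// into a single space; leading and trailing separators are dropped.
// Assumption: a symbol between two words separates them.
std::string normalizeTopic(const std::string& raw) {
    std::string out;
    for (unsigned char c : raw) {
        if (std::isalnum(c)) {
            out += static_cast<char>(std::tolower(c));
        } else if (!out.empty() && out.back() != ' ') {
            out += ' ';
        }
    }
    if (!out.empty() && out.back() == ' ') {
        out.pop_back();
    }
    return out;
}

// Capitalizes the first character of a normalized key for output.
std::string displayTopic(std::string key) {
    if (!key.empty()) {
        key[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(key[0])));
    }
    return key;
}
```

With this rule, `hot!` and `Hot` from the fourth sample post both normalize to `hot`, so they count as one topic, and `displayTopic` turns that key into `Hot` for printing.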

Sample Input:

4
This is a #test of topic#.
Another #Test of topic.#
This is a #Hot# #Hot# topic
Another #hot!# #Hot# topic

Sample Output:

Hot
2
And 1 more ...

#include <cctype>
#include <cstdio>
#include <iostream>
#include <map>
#include <set>
#include <string>

using namespace std;

// Maps each normalized topic to the set of posts (by index) that mention it.
// Using a set means a topic repeated within one post is counted only once.
map<string, set<int> > belongs;

// Lowercases letters, keeps digits, and turns every run of other characters
// into a single space; leading and trailing spaces are dropped.
string normalize(const string& raw) {
	string out;
	for(size_t j = 0; j < raw.size(); j++) {
		unsigned char c = raw[j];
		if(isalnum(c)) {
			out += (char)tolower(c);
		}else if(!out.empty() && out[out.size() - 1] != ' ') {
			out += ' ';
		}
	}
	if(!out.empty() && out[out.size() - 1] == ' ') {	// drop the trailing space
		out.erase(out.size() - 1);
	}
	return out;
}

int main() {
	int N = 0;
	string line;
	getline(cin, line);
	sscanf(line.c_str(), "%d", &N);
	for(int i = 0; i < N; i++) {
		getline(cin, line);	// unlike scanf("%[^\n]"), this also handles empty lines
		string str;
		bool inside = false;	// true while between a pair of '#'
		for(size_t j = 0; j < line.size(); j++) {
			if(line[j] == '#') {
				if(inside) {	// closing '#': record the topic under its normalized key
					string key = normalize(str);
					if(!key.empty()) {
						belongs[key].insert(i);
					}
					str = "";
				}
				inside = !inside;
			}else if(inside) {
				str += line[j];
			}
		}
	}
	size_t maxTimes = 0;	// number of posts that mention the hottest topic
	string result;	// the hottest topic; the map iterates in alphabetical order
	int other = 0;	// number of topics tied at maxTimes
	for(map<string, set<int> >::iterator it = belongs.begin(); it != belongs.end(); it++) {
		if(it->second.size() > maxTimes) {
			result = it->first;
			maxTimes = it->second.size();
			other = 1;
		}else if(it->second.size() == maxTimes) {
			other++;
		}
	}
	if(!result.empty() && result[0] >= 'a' && result[0] <= 'z') {	// capitalize the first letter
		result[0] = result[0] - 'a' + 'A';
	}
	printf("%s\n%d\n", result.c_str(), (int)maxTimes);
	if(other > 1) {
		printf("And %d more ...\n", other - 1);
	}
	return 0;
}

# Data Structures and Algorithms Problem Set (Chinese)


auhanjie

Zhuhai, China
