Advanced Hadoop: PeopleRank, Mining Valuable Users from Social Relationships
Published: 2021-03-11 21:09:09 | Category: Big Data | Source: compiled from the web
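Summary: Please credit the source when reposting. Reposted from Thinkgamer's CSDN blog: blog.csdn.net/gamer_gyt. The code is available for download from GitHub. 1: PageRank and PeopleRank 2: Requirements analysis: mining valuable users of CSDN blogs 3: Algorithm model: the PeopleRank algorithm 4: Architecture design: from data preparation to a MapReduce version of the PR algorithm 5: Program development: had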
Only part of the code is shown below; the rest can be downloaded from GitHub.
dataEtl.java
package pagerankjisuan;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
public class dataEtl {
public static void main() throws IOException {
File f1 = new File("MyItems/pagerankjisuan/people.csv");
if(f1.isFile()){
f1.delete();
}
File f = new File("MyItems/pagerankjisuan/peoplerank.txt");
if(f.isFile()){
f.delete();
}
// Open the input file
File file = new File("MyItems/pagerankjisuan/day7_author100_mess.csv");
// Create a buffered reader over the file
BufferedReader reader = new BufferedReader(new FileReader(file));
try {
String line=null;
// Keep reading until readLine() returns null (end of file)
while( (line=reader.readLine()) != null)
{
String[] userMess = line.split( "," );
// Field 0 is the user id; the follower list is the last field (index 10 in 11-field rows, otherwise index 9)
String userid = userMess[0];
if(userMess.length!=0){
if(userMess.length==11)
{
int i=0;
String[] focusName = userMess[10].split("\\|"); // "|" is a regex metacharacter, so it must be escaped
for (i=1;i < focusName.length; i++)
{
write(userid,focusName[i]);
// System.out.println(userid+ " " + focusName[i]);
}
}
else
{
int j =0;
String[] focusName = userMess[9].split("\\|"); // "|" is a regex metacharacter, so it must be escaped
for (j=1;j < focusName.length; j++)
{
write(userid,focusName[j]);
// System.out.println(userid+ " " + focusName[j]);
}
}
}
}
}
catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
finally
{
reader.close();
// Initialize peoplerank.txt: ids 1..100 each start with a PR value of 1
for(int i=1;i<=100;i++){
FileWriter writer = new FileWriter("MyItems/pagerankjisuan/peoplerank.txt",true);
writer.write(i + "\t" + 1 + "\n");
writer.close();
}
}
System.out.println("OK..................");
}
private static void write(String userid,String nameid) {
// TODO Auto-generated method stub
// Append one "userid,nameid" record per line
try {
if(!nameid.contains("\n")){
FileWriter writer = new FileWriter("MyItems/pagerankjisuan/people.csv",true);
writer.write(userid + "," + nameid + "\n");
writer.close();
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
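The listing above splits the follower field on the pipe character, and `String.split` takes a regular expression, so the pipe has to be escaped. The following is a minimal sketch of that one step. The class name `SplitPipeDemo`, the helper `toEdges`, and the sample field `"|u2|u3"` are hypothetical and chosen only for illustration; the real layout of the follower field in `day7_author100_mess.csv` is not shown in this post.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitPipeDemo {
    // Turns one follower field into "userid,name" records.
    // Assumption: the field starts with a pipe, e.g. "|u2|u3", which would
    // explain why the loops in dataEtl start at index 1.
    static List<String> toEdges(String userid, String field) {
        List<String> edges = new ArrayList<>();
        // "\\|" is the regex for a literal pipe character.
        String[] names = field.split("\\|");
        // Skip element 0, mirroring the loops in dataEtl.
        for (int i = 1; i < names.length; i++) {
            edges.add(userid + "," + names[i]);
        }
        return edges;
    }

    public static void main(String[] args) {
        // A bare "|" is an empty alternation: it matches between every
        // character, so on Java 8 or later the input is cut into single characters.
        System.out.println("a|b".split("|").length);   // 3
        System.out.println("a|b".split("\\|").length); // 2
        System.out.println(toEdges("1", "|u2|u3"));    // [1,u2, 1,u3]
    }
}
```

`java.util.regex.Pattern.quote("|")` would work equally well as the delimiter if the escaping is hard to read.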
prjob.java
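The PR job itself is not reproduced in this excerpt. As a rough single-machine illustration of the kind of computation a PageRank-style job performs on the two files written by dataEtl, here is a sketch of one iteration. It is not the author's MapReduce code. The class `PeopleRankSketch`, its method names, the damping factor 0.85, the update rule `(1 - d) + d * sum(rank / outDegree)`, and the reading of each `userid,nameid` record as a directed edge from the first column to the second are all assumptions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PeopleRankSketch {
    // Every node that appears in an edge starts with rank 1,
    // like the initial values written to peoplerank.txt.
    static Map<String, Double> initialRanks(List<String[]> edges) {
        Map<String, Double> rank = new TreeMap<>();
        for (String[] e : edges) {
            rank.put(e[0], 1.0);
            rank.put(e[1], 1.0);
        }
        return rank;
    }

    // One iteration: each node splits its current rank evenly among the
    // nodes it points to; d is the damping factor (assumed, e.g. 0.85).
    static Map<String, Double> iterate(List<String[]> edges,
                                       Map<String, Double> rank, double d) {
        Map<String, Integer> outDegree = new HashMap<>();
        for (String[] e : edges) {
            outDegree.merge(e[0], 1, Integer::sum);
        }
        Map<String, Double> next = new TreeMap<>();
        for (String node : rank.keySet()) {
            next.put(node, 1 - d); // base value every node receives
        }
        for (String[] e : edges) {
            double share = rank.get(e[0]) / outDegree.get(e[0]);
            next.merge(e[1], d * share, Double::sum);
        }
        return next;
    }

    public static void main(String[] args) {
        // Hypothetical toy graph: 1->2, 1->3, 2->3, 3->1
        List<String[]> edges = Arrays.asList(
                new String[]{"1", "2"}, new String[]{"1", "3"},
                new String[]{"2", "3"}, new String[]{"3", "1"});
        Map<String, Double> rank = initialRanks(edges);
        for (int i = 0; i < 10; i++) {
            rank = iterate(edges, rank, 0.85);
        }
        System.out.println(rank);
    }
}
```

In a MapReduce version, the map side would typically emit each `share` keyed by the target user and the reduce side would sum the shares per key, which is what the second loop in `iterate` does in memory.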


