I want to export my entire message archive as plain text so that I can search it. Is this possible?
Answer 1
Yahoo! Messenger Archive Viewer can do this for you. Some features require payment, but exporting to plain text is free. It also lets you search your archives.
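Answer 2
The only thing I could find that is similar to what you are asking for is this utility: Yahoo Message Archive Decoder. It seems to work well, although some content (such as email addresses) is shown garbled, and you have to pay to see it.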
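Answer 3
I hacked together this PHP script to export my archives. As noted in the comments, most of the code was taken from two other scripts written by clever people.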
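It works with Yahoo Messenger 10 and below, where all message archiving is done locally, but not with later versions. I ran the PHP script on Windows XP: I had just installed VertrigoServ, put the script in place as the web server's index file, and then opened
http://localhost/
in my browser.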
All matching files are output in order of modification time, oldest first.
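For anyone who only wants the ordering step, here is a minimal sketch of sorting a list of paths by modification time, oldest first. The helper name `sort_oldest_first` and the throwaway demo files are my own assumptions for illustration, not part of the script; it also assumes PHP 7 or later for the `<=>` operator.

```php
<?php
// Hypothetical helper (not part of the script): order file paths by
// last-modified time, oldest first.
function sort_oldest_first(array $paths): array {
    clearstatcache(); // make sure filemtime() reads fresh values
    usort($paths, function ($a, $b) {
        return filemtime($a) <=> filemtime($b);
    });
    return $paths;
}

// Demo with throwaway files whose timestamps are set explicitly.
$dir = sys_get_temp_dir() . '/ym_sort_demo_' . uniqid();
mkdir($dir);
$newer = $dir . '/newer.dat';
$older = $dir . '/older.dat';
touch($newer, 1225500000);
touch($older, 1225400000);

$sorted = sort_oldest_first(array($newer, $older));
echo implode("\n", array_map('basename', $sorted)), "\n";

// Remove the demo files again.
unlink($newer);
unlink($older);
rmdir($dir);
```

Sorting on `filemtime()` only reflects chat order if the archive files still carry their original timestamps; copying them to another disk can reset those.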
Configure these parameters to suit:
// Where your Yahoo Messenger files are kept
$archive_dir = 'C:/Program Files/Yahoo!/Messenger/Profiles';
// Regular expression filtering the path names of files for particular
// user names (e.g. I want all the messages I've exchanged with john or
// bob)
$archive_people_files ='/Messages.*(john|bob)/';
// Output file name
$output_file_name = 'ym_output.txt';
// Name of my account
$from_name = 'geoffs_account_name';
// My name as I want it to appear in the output
$to_name = 'geoff';
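To see what the `$archive_people_files` filter will actually pick up, here is a small sketch that runs it against a few made-up paths. The paths and the tighter second pattern are assumptions of mine; the directory layout is only what the script's own regular expressions suggest. Note that the pattern is unanchored, so `bob` also matches a contact called `bobby`.

```php
<?php
// Pattern from the configuration above.
$archive_people_files = '/Messages.*(john|bob)/';

// Hypothetical archive paths, invented for illustration only.
$base = 'C:/Program Files/Yahoo!/Messenger/Profiles/geoffs_account_name/Archive/Messages';
$paths = array(
    $base . '/john/20081101-geoffs_account_name.dat',
    $base . '/alice/20081102-geoffs_account_name.dat',
    $base . '/bobby/20081103-geoffs_account_name.dat',
);

// preg_grep() keeps the entries that match and preserves their array keys.
$loose = preg_grep($archive_people_files, $paths);

// A stricter alternative (my assumption, not the author's pattern):
// require the contact name to be a whole directory component.
$tight = preg_grep('#/Messages/(john|bob)/#', $paths);

echo "Loose pattern matches: ", count($loose), "\n";
echo "Tight pattern matches: ", count($tight), "\n";
```

If your own account name happens to contain one of the contact names, the loose pattern may also match on the file name part of the path, so the stricter form can be the safer choice.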
Full script:
<?php
// Put together using these scripts:
// http://www.pgregg.com/projects/php/preg_find/preg_find.php.txt
// http://www.0xcafefeed.com/2007/12/yahoo-messenger-archive-file-format/
// Found on http://superuser.com/questions/130196/yahoo-messenger-how-to-mass-export-entire-message-archive-as-text-files
/**
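* Usage (of the original decoder this script is based on; the version
* below takes no arguments and reads the settings further down instead):
* ./decodeYahoo.php account contact file
*
* 'account' is the Y!M account name.
* 'contact' is the name of the contact
* 'file' is the file to parse.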
*
* Please note: There is no error checking
* in the code below. If you want to use
* this code for anything important, please
* add some. Also, fopen/fread would be more
* memory efficient than a file_get_contents,
* but again, I'm being super lazy today. (o:
*/
date_default_timezone_set("Europe/London");
// Where your Yahoo Messenger files are kept
$archive_dir = 'C:/Program Files/Yahoo!/Messenger/Profiles';
// Regular expression filtering the path names of files for particular user names (e.g. I want all the messages I've exchanged with john or bob)
$archive_people_files ='/Messages.*(john|bob)/';
// Output file name
$output_file_name = 'ym_output.txt';
// Name of my account
$from_name = 'geoffs_account_name';
// My name as I want it to appear in the output
$to_name = 'geoff';
$entry_name="";
$rg_files=array();
/*
* Find files in a directory matching a pattern
*
*
* Paul Gregg
* 20 March 2004, Updated 20 April 2004
* Updated 18 April 2007 to add the ability to sort the result set
* Updated 9 June 2007 to prevent multiple calls to sort during recursion
* Updated 12 June 2009 to allow for sorting by extension and prevent following
* symlinks by default
* Version: 2.3
* This function is backwards compatible with any code written for a
* previous version of preg_find()
*
* Open Source Code: If you use this code on your site for public
* access (i.e. on the Internet) then you must attribute the author and
* source web site: http://www.pgregg.com/projects/php/preg_find/preg_find.phps
* Working examples: http://www.pgregg.com/projects/php/preg_find/
*
*/
define('PREG_FIND_RECURSIVE', 1);
define('PREG_FIND_DIRMATCH', 2);
define('PREG_FIND_FULLPATH', 4);
define('PREG_FIND_NEGATE', 8);
define('PREG_FIND_DIRONLY', 16);
define('PREG_FIND_RETURNASSOC', 32);
define('PREG_FIND_SORTDESC', 64);
define('PREG_FIND_SORTKEYS', 128);
define('PREG_FIND_SORTBASENAME', 256); # requires PREG_FIND_RETURNASSOC
define('PREG_FIND_SORTMODIFIED', 512); # requires PREG_FIND_RETURNASSOC
define('PREG_FIND_SORTFILESIZE', 1024); # requires PREG_FIND_RETURNASSOC
define('PREG_FIND_SORTDISKUSAGE', 2048); # requires PREG_FIND_RETURNASSOC
define('PREG_FIND_SORTEXTENSION', 4096); # requires PREG_FIND_RETURNASSOC
define('PREG_FIND_FOLLOWSYMLINKS', 8192);
// PREG_FIND_RECURSIVE - go into subdirectories looking for more files
// PREG_FIND_DIRMATCH - return directories that match the pattern also
// PREG_FIND_DIRONLY - return only directories that match the pattern (no files)
// PREG_FIND_FULLPATH - search for the pattern in the full path (dir+file)
// PREG_FIND_NEGATE - return files that don't match the pattern
// PREG_FIND_RETURNASSOC - Instead of just returning a plain array of matches,
// return an associative array with file stats
// PREG_FIND_FOLLOWSYMLINKS - Recursive searches (from v2.3) will no longer
// traverse symlinks to directories, unless you
// specify this flag. This is to prevent nasty
// endless loops.
//
// You can also request to have the results sorted based on various criteria
// By default if any sorting is done, it will be sorted in ascending order.
// You can reverse this via use of:
// PREG_FIND_SORTDESC - Reverse order of sort
// PREG_FIND_SORTKEYS - Sort on the keyvalues or non-assoc array results
// The following sorts *require* PREG_FIND_RETURNASSOC to be used as they are
// sorting on values stored in the constructed associative array
// PREG_FIND_SORTBASENAME - Sort the results in alphabetical order on filename
// PREG_FIND_SORTMODIFIED - Sort the results in last modified timestamp order
// PREG_FIND_SORTFILESIZE - Sort the results based on filesize
// PREG_FIND_SORTDISKUSAGE - Sort based on the amount of disk space taken
// PREG_FIND_SORTEXTENSION - Sort based on the filename extension
// to use more than one simply separate them with a | character
// Search for files matching $pattern in $start_dir.
// if args contains PREG_FIND_RECURSIVE then do a recursive search
// return value is an associative array, the key of which is the path/file
// and the value is the stat of the file.
Function preg_find($pattern, $start_dir='.', $args=NULL) {
static $depth = -1;
++$depth;
$files_matched = array();
$fh = opendir($start_dir);
while (($file = readdir($fh)) !== false) {
if (strcmp($file, '.')==0 || strcmp($file, '..')==0) continue;
$filepath = $start_dir . '/' . $file;
if (preg_match($pattern,
($args & PREG_FIND_FULLPATH) ? $filepath : $file)) {
$doadd = is_file($filepath)
|| (is_dir($filepath) && ($args & PREG_FIND_DIRMATCH))
|| (is_dir($filepath) && ($args & PREG_FIND_DIRONLY));
if ($args & PREG_FIND_DIRONLY && $doadd && !is_dir($filepath)) $doadd = false;
if ($args & PREG_FIND_NEGATE) $doadd = !$doadd;
if ($doadd) {
if ($args & PREG_FIND_RETURNASSOC) { // return more than just the filenames
$fileres = array();
if (function_exists('stat')) {
$fileres['stat'] = stat($filepath);
$fileres['du'] = $fileres['stat']['blocks'] * 512;
}
if (function_exists('fileowner')) $fileres['uid'] = fileowner($filepath);
if (function_exists('filegroup')) $fileres['gid'] = filegroup($filepath);
if (function_exists('filetype')) $fileres['filetype'] = filetype($filepath);
if (function_exists('mime_content_type')) $fileres['mimetype'] = mime_content_type($filepath);
if (function_exists('dirname')) $fileres['dirname'] = dirname($filepath);
if (function_exists('basename')) $fileres['basename'] = basename($filepath);
if (($i=strrpos($fileres['basename'], '.'))!==false) $fileres['ext'] = substr($fileres['basename'], $i+1); else $fileres['ext'] = '';
if (isset($fileres['uid']) && function_exists('posix_getpwuid')) $fileres['owner'] = posix_getpwuid ($fileres['uid']);
$files_matched[$filepath] = $fileres;
} else
array_push($files_matched, $filepath);
}
}
if ( is_dir($filepath) && ($args & PREG_FIND_RECURSIVE) ) {
if (!is_link($filepath) || ($args & PREG_FIND_FOLLOWSYMLINKS))
$files_matched = array_merge($files_matched,
preg_find($pattern, $filepath, $args));
}
}
closedir($fh);
// Before returning check if we need to sort the results.
if (($depth==0) && ($args & (PREG_FIND_SORTKEYS|PREG_FIND_SORTBASENAME|PREG_FIND_SORTMODIFIED|PREG_FIND_SORTFILESIZE|PREG_FIND_SORTDISKUSAGE|PREG_FIND_SORTEXTENSION)) ) {
$order = ($args & PREG_FIND_SORTDESC) ? 1 : -1;
// Path of array keys leading to the value to sort on (empty = compare whole entries)
$sortkeys = array();
if ($args & PREG_FIND_RETURNASSOC) {
if ($args & PREG_FIND_SORTMODIFIED) $sortkeys = array('stat', 'mtime');
if ($args & PREG_FIND_SORTBASENAME) $sortkeys = array('basename');
if ($args & PREG_FIND_SORTFILESIZE) $sortkeys = array('stat', 'size');
if ($args & PREG_FIND_SORTDISKUSAGE) $sortkeys = array('du');
if ($args & PREG_FIND_SORTEXTENSION) $sortkeys = array('ext');
}
// create_function() was removed in PHP 8.0, so use a closure (PHP 5.3+) instead.
$filesort = function ($a, $b) use ($sortkeys, $order) {
foreach ($sortkeys as $k) { $a = $a[$k]; $b = $b[$k]; }
if ($a == $b) return 0;
return ($a < $b) ? $order : -$order;
};
uasort($files_matched, $filesort);
}
--$depth;
return $files_matched;
}
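// Only one sort flag is passed: if PREG_FIND_SORTBASENAME is also set, preg_find()
// checks it after PREG_FIND_SORTMODIFIED and sorts by base name instead of by mtime.
$files = preg_find($archive_people_files,$archive_dir, PREG_FIND_FULLPATH | PREG_FIND_RECURSIVE | PREG_FIND_RETURNASSOC | PREG_FIND_SORTMODIFIED);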
$files=array_keys($files);
/*
// Output each opened file and then close
foreach ($files as $filename) {
echo "<p>" . $filename;
}
*/
if (!($fp = fopen($output_file_name, 'w'))) {
die("Could not open $output_file_name for writing");
}
foreach ($files as $filename) {
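// Skip any file whose path does not look like Profiles/<account>/Archive/Messages/<contact>/20YYMMDD-...
if (!preg_match('/Profiles.(.*).Archive.Messages.(.*).20......-/',$filename, $matches)) continue;
$account = $matches[1];
$to = $matches[1];
$contact = $matches[2];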
$p = 0; // Position in the file's data.
$data = file_get_contents($filename);
fwrite($fp, "*** $filename\n\n"); // fwrite, so a '%' in the path is not read as a format code
while ($p < strlen($data)) {
$result = array();
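// Each record starts with a 16-byte header; the third argument of substr() is a length, not an end offset.
$pieces = unpack('ltime/lunknown/ltoFrom/llength', substr($data, $p, 16));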
$p += 16;
$result['date'] = date('r', $pieces['time']);
$result['to'] = $pieces['toFrom'] ? $contact : $account;
if ($result['to'] == $from_name) {$result['to'] = $to_name;}
$cypherText = substr($data, $p, $pieces['length']);
$p += $pieces['length'];
// Generates the XOR key by repeating the account name
// then chopping off extra characters so the entire message
// can be XORed in one operation.
$key = str_repeat($to, (int) ceil($pieces['length'] / strlen($to)));
$key = substr($key, 0, strlen($cypherText));
$result['text'] = $key ^ $cypherText;
// There seems to be an extra 4 bytes of junk here.
$p+=4;
// Nov 01, 2008 12:34 PersonTalking: That's what SHE said.
$new_text = preg_replace('/<font.*?Arial">/', '', $result['text'] );
$new_text = preg_replace('/...000000m/', '', $new_text );
$new_text = preg_replace('/<.font><.font>/', '', $new_text );
$new_text = preg_replace('/%/', '%%', $new_text );
$new_text = preg_replace('/\r/', "\n \t \t", $new_text );
$new_text = preg_replace('/\e\[.?2m/', "---- ", $new_text );
// echo "{$result['date']} {$result['to']}: {$new_text}\n<p>";
fprintf($fp, "{$result['date']}\t{$result['to']}:\t{$new_text}\n\n");
}
}
echo "done";
?>
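The header comment in the script mentions that fopen/fread would be lighter on memory than file_get_contents. Below is a rough sketch of that idea, run against a synthetic record built in memory so it can be tried without a real archive. The record layout (four 32-bit header fields, the XORed text, then 4 trailing bytes) is simply what the script above reads; the function names, the array keys, and the use of the little-endian `V` format code instead of `l` are my own assumptions. It needs PHP 7 or later.

```php
<?php
// XOR $text with $key repeated to the same length (same idea as the script).
function xor_with_key(string $text, string $key): string {
    if ($text === '' || $key === '') {
        return $text; // nothing to decode, or no key to decode with
    }
    $repeats = intdiv(strlen($text), strlen($key)) + 1;
    $stream = substr(str_repeat($key, $repeats), 0, strlen($text));
    return $text ^ $stream;
}

// Read records one at a time from an open stream instead of loading the whole file.
function read_records($fh, string $account): array {
    $records = array();
    while (strlen((string) ($header = fread($fh, 16))) === 16) {
        $h = unpack('Vtime/Vunknown/VtoFrom/Vlength', $header);
        $cipher = $h['length'] > 0 ? fread($fh, $h['length']) : '';
        fread($fh, 4); // the 4 extra bytes the script also skips
        $records[] = array(
            'time' => $h['time'],
            'from_contact' => (bool) $h['toFrom'], // hypothetical key name
            'text' => xor_with_key($cipher, $account),
        );
    }
    return $records;
}

// Build two identical synthetic records and read them back.
$account = 'geoffs_account_name';
$plain = 'hello world';
$record = pack('V4', 1225542840, 0, 1, strlen($plain))
    . xor_with_key($plain, $account)
    . pack('V', 0);

$fh = fopen('php://memory', 'w+b');
fwrite($fh, $record . $record);
rewind($fh);
$records = read_records($fh, $account);
fclose($fh);

echo count($records), " records, first text: ", $records[0]['text'], "\n";
```

Like the original, this has no handling for truncated or corrupt files, so treat it as a starting point only.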