Phone SMS XML to boxed conversation in LaTeX

This is part of an XML file generated by an Android app for backing up SMS messages.

<sms protocol="0" address="+1234567890" date="1602132754403" type="1" subject="null" body="Hi, this is first input. ok? " toa="null" sc_toa="null" service_center="+998877665544" read="1" status="-1" locked="0" date_sent="1602132750000" sub_id="1" readable_date="8 oct. 2020 10:22:34" contact_name="yaya" />

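From this file I would like to obtain, as faithfully as possible, a boxed conversation in which one participant's messages are on the left and the other's on the right. Using tcolorbox, I'd like the title to contain the value of "address", i.e. the sender's phone number. Inside the box, all the technical details found at the beginning and end of each line of the source file must go at the top in a small font, and the chat content must go at the bottom in the normal font. I don't know whether it is also possible to use a human-readable date format.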

Update

Here is a longer sample of the XML file:

<!--File Created By SMS Backup & Restore v10.08.006 on 12/11/2020 18:37:07-->
<!--

To view this file in a more readable format, visit https://synctech.com.au/view-backup/

-->
<smses count="18" backup_set="10d36e93-2204-ba47-c5b3-cef114e63e22" backup_date="1605186427168" type="full">
  <sms protocol="0" address="+112233344455" date="1581427096346" type="1" subject="null" body="Tomorrow positively" toa="null" sc_toa="null" service_center="+1627384950" read="1" status="-1" locked="0" date_sent="1581427095000" sub_id="1" readable_date="11 feb. 2020 18:48:16" contact_name="best friend" />
  <sms protocol="0" address="+22333 44455" date="1582198718331" type="2" subject="null" body="hello friend, let me know if you're available this weekend..." toa="null" sc_toa="null" service_center="null" read="1" status="-1" locked="0" date_sent="0" sub_id="1" readable_date="20 feb. 2020 17:08:38" contact_name="best friend" />
  <sms protocol="0" address="+112233344455" date="1582199748809" type="1" subject="null" body="Yes you can please come this weekend" toa="null" sc_toa="null" service_center="+1627384950" read="1" status="-1" locked="0" date_sent="1582199734000" sub_id="1" readable_date="20 feb. 2020 17:25:48" contact_name="best friend" />
  <sms protocol="0" address="+22333 44455" date="1582347025313" type="2" subject="null" body="sunday around 3 pm?" toa="null" sc_toa="null" service_center="null" read="1" status="0" locked="0" date_sent="1582347033917" sub_id="1" readable_date="22 feb. 2020 10:20:25" contact_name="best friend" />
  <sms protocol="0" address="BT-CTOPUP" date="1585539138933" type="1" subject="null" body="Recharge with $4.85 by 1928374656 on 30/03/2020@08:52:59 AM,GST $7.98 given in main balance,valid till 27/06/20, CB $19.70 Ref.8475601927" toa="null" sc_toa="null" service_center="+578493098475" read="1" status="-1" locked="0" date_sent="1585538753000" sub_id="1" readable_date="30 mars 2020 09:02:18" contact_name="(Unknown)" />
  <sms protocol="0" address="BT-CTOPUP" date="1585884332885" type="1" subject="null" body="Recharge with $16 by 45637284 on 03/04/2020@08:52:05 AM,GST $2.44 given in main balance,valid till 27/06/20, CB $19 Ref.625378901" toa="null" sc_toa="null" service_center="+578493098475" read="1" status="-1" locked="0" date_sent="1585884336000" sub_id="1" readable_date="3 apr. 2020 08:55:32" contact_name="(Unknown)" />
  <sms protocol="0" address="+162738495078" date="1591018139214" type="1" subject="null" body="Hi there, are you ready? " toa="null" sc_toa="null" service_center="+17283940567" read="1" status="-1" locked="0" date_sent="1591018163000" sub_id="1" readable_date="1 june 2020 18:58:59" contact_name="another best friend" />
  <sms protocol="0" address="27384 95078" date="1591019084514" type="2" subject="null" body="yes" toa="null" sc_toa="null" service_center="null" read="1" status="68" locked="0" date_sent="1591106098069" sub_id="1" readable_date="1 june 2020 19:14:44" contact_name="another best friend" />
  <sms protocol="0" address="BT-CTOPUP" date="1593410081824" type="1" subject="null" body="Recharge with $6 by 45637284 on 29/06/2020@11:25:21 AM,GST $1.59 given in main balance,valid till 09/11/20, CB $13.5 Ref.4430684" toa="null" sc_toa="null" service_center="+578493098475" read="1" status="-1" locked="0" date_sent="1593410112000" sub_id="1" readable_date="29 june 2020 11:24:41" contact_name="(Unknown)" />
  <sms protocol="0" address="27384 95078" date="1593648493160" type="2" subject="null" body="Hot summer is here." toa="null" sc_toa="null" service_center="null" read="1" status="0" locked="0" date_sent="1593648500171" sub_id="1" readable_date="2 jul. 2020 05:38:13" contact_name="another best friend" />
  <sms protocol="0" address="+162738495078" date="1593656855536" type="1" subject="null" body="OK will come today" toa="null" sc_toa="null" service_center="+17283940567" read="1" status="-1" locked="0" date_sent="1593654668000" sub_id="1" readable_date="2 jul. 2020 07:57:35" contact_name="another best friend" />
  <sms protocol="0" address="+162738495078" date="1594438358421" type="1" subject="null" body="hi 30 coins for you" toa="null" sc_toa="null" service_center="+17283940567" read="1" status="-1" locked="0" date_sent="1594438416000" sub_id="1" readable_date="11 jul. 2020 09:02:38" contact_name="another best friend" />
  <sms protocol="0" address="+162738495078" date="1594654966188" type="1" subject="null" body="Can you give half to jerry? " toa="null" sc_toa="null" service_center="+17283940567" read="1" status="-1" locked="0" date_sent="1594655018000" sub_id="1" readable_date="13 jul. 2020 21:12:46" contact_name="another best friend" />
  <sms protocol="0" address="27384 95078" date="1598325648534" type="2" subject="null" body="hi, long time we last met!" toa="null" sc_toa="null" service_center="null" read="1" status="0" locked="0" date_sent="1598325653623" sub_id="1" readable_date="25 aug. 2020 08:50:48" contact_name="another best friend" />
  <sms protocol="0" address="+162738495078" date="1598328721739" type="1" subject="null" body="Indeed, let's meet" toa="null" sc_toa="null" service_center="+17283940567" read="1" status="-1" locked="0" date_sent="1598328711000" sub_id="1" readable_date="25 aug. 2020 09:42:01" contact_name="another best friend" />
  <sms protocol="0" address="27384 95078" date="1598333198381" type="2" subject="null" body="I'll be out and i'll be back in 2 hrs max." toa="null" sc_toa="null" service_center="null" read="1" status="0" locked="0" date_sent="1598333206580" sub_id="1" readable_date="25 aug. 2020 10:56:38" contact_name="another best friend" />
  <sms protocol="0" address="+162738495078" date="1598333470558" type="1" subject="null" body="I'm not coming now" toa="null" sc_toa="null" service_center="+17283940567" read="1" status="-1" locked="0" date_sent="1598333469000" sub_id="1" readable_date="25 aug. 2020 11:01:10" contact_name="another best friend" />
</smses>

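In the output I'd like to show only the entries since 1 June from "another best friend" that have type "1", together with all entries of type "2" for the same dates. So, in this example, none of the entries from "best friend" and "(Unknown)" should be shown.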

Update 2

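I have been able to implement a function that converts the Unix timestamp to a readable format.

In the sms.sty file provided in the answer, I added the following to the luacode section: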

  function epoch (format,time)
  if format == 1 then
    fmt = "%c"
-- below division is for the case time is given in ms instead of sec. And
-- 'time' must be an integer, thus the use of math.floor()
    time = math.floor( time / 1000 )

  elseif format == 2 then
    fmt = "%A"
  elseif format == 3 then
    fmt = "%B"
  elseif format == 4 then
    fmt = "%X"
  else
    fmt = "%x"
  end
  tex.sprint(os.date(fmt, time))
  end

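Then, in smslib.lua, the epoch function is used inside the double-square-bracket templates as follows: \epoch{1}{@{'date_sent'}}

It works well, but as far as I can tell, the epoch function, which I found somewhere, could be improved.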
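One possible way to tidy up the function above is sketched below. It is only a sketch under a few assumptions: the name format_epoch, the formats table and its keys are hypothetical, the input is always taken to be in milliseconds (as in the SMS backup file), and the function returns a string instead of printing it.

```lua
-- Hypothetical helper (not part of sms.sty or smslib.lua):
-- convert a millisecond Unix timestamp to a formatted date string.
local formats = {
  full    = "%c", -- locale-dependent date and time
  weekday = "%A",
  month   = "%B",
  time    = "%X",
  date    = "%x",
}

local function format_epoch(ms, fmt)
  local n = tonumber(ms)
  -- date_sent can be "0" or missing; return an empty string in that case
  if not n or n <= 0 then return "" end
  -- the backup stores milliseconds, os.date expects whole seconds
  local seconds = math.floor(n / 1000)
  -- fmt is either a key of the formats table or a raw os.date pattern
  return os.date(formats[fmt] or fmt or "%x", seconds)
end

-- the leading "!" asks os.date for UTC, which makes the output reproducible
print(format_epoch("1602132754403", "!%Y-%m-%d %H:%M:%S"))
```

Inside LuaTeX the returned string could be passed to tex.sprint. Keeping the variables local avoids the accidental globals (fmt) of the original, and dividing by 1000 for every format keeps the behaviour consistent.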

Update 3

Here is the solution to my request, in smslib.lua:

-- instead of :
--transform.add_action("sms[type='2']", me_template)
--transform.add_action("sms[type='1']", other_template)

-- do this :
local proc_me = transform.simple_content(me_template)
local proc_other = transform.simple_content(other_template)

local cutoff_date = 1000 * os.time{ year = 2020, month = 10, day = 7 }

transform.add_custom_action("sms[type='2'][address='98407 47754']", 
function(e) if tonumber(e:get_attribute("date")) >= cutoff_date then 
  return proc_me(e) end end )

transform.add_custom_action("sms[type='1'][address='+919840747754']", 
function(e) if tonumber(e:get_attribute("date")) >= cutoff_date then 
  return proc_other(e) end end )

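Update 4

This is the graphical output of the final code. The first tcolorbox is created with the tcb environment senderbox from sms.sty, which comes from the solution provided by michal.h21, and is called from the file sample.tex. Both the title and the content are written by hand in sample.tex. The second box is created by senderbox called from smslib.lua, where other_template =[[\begin{senderbox}[...]{...}@{body}\end{senderbox}]]. Whether the content is read from the sms.xml file or written by hand in place of @{body}, the result is the same. I added a vertical line to highlight the indentation of the box content that appears in this case.

[image]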

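Answer 1

** Edit **

The OP had a number of additional requirements, so here is the updated code.

The additional functions filter the XML DOM object before it is transformed to LaTeX code. Specifically, they filter messages by date or by sender. The updated Lua library smslib.lua now looks like this: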

kpse.set_program_name "luatex"
local domobject = require "luaxml-domobject"

local transform = require "luaxml-transform"

-- module
local M = {}

-- templates for SMS print
-- template for user
local me_template = [[
\begin{mebox}[]{Me}
{\tiny subject:@{subject}, read: @{read}}\\@{readable_date}\tcblower{}@{body}
\end{mebox}
]]

-- template for the person who messages
local other_template = [[
\begin{senderbox}[]{@{contact_name} $\langle$@{address}$\rangle$}
{\tiny subject:@{subject}, read: @{read}}\\@{readable_date}\tcblower{}@{body}
\end{senderbox}

]]

transform.add_action("sms[type='2']", me_template)
transform.add_action("sms[type='1']", other_template)

M.load_xml = function(filename)
  local f, message = io.open(filename, "r")
  if not f then 
    print("XML file error: ", message)
    return nil, message
  end
  local content = f:read("*all")
  f:close()
  content = content:gsub("\r", "")
  local dom = domobject.parse(content)
  return dom
end

local function date_to_sms_time(date)
  local year, month, day = date:match("(%d+)%-(%d+)%-(%d+)")
  if not year then return nil, "Cannot parse date" end
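  -- SMS XML dates are in milliseconds, so the result is multiplied by 1000.
  -- hour = 0 selects midnight; os.time defaults to 12:00 when hour is omitted
  local base_time = os.time({year = tonumber(year), month = tonumber(month), day = tonumber(day), hour = 0})
  return 1000 * base_time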
end

-- process all records and execute test function
M.filter = function(dom, fn)
  for _, rec in ipairs(dom:query_selector("sms")) do
    -- if test function returns true, the record will be removed
    local status = fn(rec)
    if status then 
      rec:remove_node()
    end
  end
end

M.from_date = function(dom, date)
  local startday, msg = date_to_sms_time(date)
  if not startday then return nil, msg end
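  M.filter(dom, function(rec)
    -- date_sent can be missing or "0"; fall back to the date attribute then
    local date = tonumber(rec:get_attribute("date_sent") or "") or 0
    if date == 0 then date = tonumber(rec:get_attribute("date") or "") or 0 end
    return date < startday
  end)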
end

-- set international phone number prefix
M.set_number_prefix = function(prefix)
  M.prefix = prefix
end

-- compare two phone numbers. we must normalize them
-- by removing spaces and adding international prefix
-- if it is missing in one of numbers
M.match_numbers = function(first, second)
  local prefix = M.prefix or ""
  local normalize = function(number)
    if not number:match("^%+") then number = prefix .. number end
    return number:gsub("%s", "")
  end
  return normalize(first) == normalize(second)
end

  


M.filter_sender = function(dom, sender)
  if sender == "" then return nil, "empty sender" end
  M.filter(dom, function(rec)
    local typ = rec:get_attribute("type") 
    local address = rec:get_attribute("address")
    -- remove all sms that are not part of the conversation with this number
    return M.match_numbers(address, sender) ~= true
  end)
end




M.process_xml = function(dom)
  local converted = transform.process_dom(dom)
  tex.print(converted)
end

-- M.process_xml("messages.xml")

return M

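The filter function loops over all SMS records and tests each one with the function passed as a parameter. If the test function returns true, the message is removed. There are two filters built on it: one removes messages older than a given date, the other removes messages that are not part of the conversation with the given number.

We also need to update the sms.sty package, because we want to be able to pass the date and the sender number to the filters: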
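The remove-when-true pattern and the phone number normalization can be tried outside of TeX. The following is only an illustration: plain Lua tables stand in for the `<sms>` DOM nodes, the records are taken from the sample file, and the helper names normalize and filter are hypothetical.

```lua
-- Illustration only: plain tables stand in for the <sms> DOM nodes.
local function normalize(number, prefix)
  -- add the international prefix when it is missing, then strip whitespace
  if not number:match("^%+") then number = (prefix or "") .. number end
  return (number:gsub("%s", ""))
end

-- Return the records for which should_remove(rec) is false.
local function filter(records, should_remove)
  local kept = {}
  for _, rec in ipairs(records) do
    if not should_remove(rec) then kept[#kept + 1] = rec end
  end
  return kept
end

local records = {
  { address = "+112233344455", date = 1581427096346, body = "Tomorrow positively" },
  { address = "+162738495078", date = 1591018139214, body = "Hi there, are you ready? " },
  { address = "27384 95078",   date = 1591019084514, body = "yes" },
}

local prefix = "+16"
local sender = "+162738495078"
-- midnight (local time) of 2020-06-01, in milliseconds
local start = 1000 * os.time{ year = 2020, month = 6, day = 1, hour = 0 }

local conversation = filter(records, function(rec)
  return rec.date < start
      or normalize(rec.address, prefix) ~= normalize(sender, prefix)
end)

for _, rec in ipairs(conversation) do print(rec.address, rec.body) end
```

Unlike M.filter, which removes nodes from the DOM in place, this sketch builds a new list; the predicate has the same meaning in both cases.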

\ProvidesPackage{sms}
\RequirePackage{tcolorbox}
\RequirePackage{xkeyval}
\RequirePackage{luacode}
\begin{luacode*}
smslib = require "smslib"
\end{luacode*}

% settings for the boxes
% modify the values to your liking
\newtcolorbox{mebox}[2][]{%
colback=blue!10!white,colframe=red!70!black,
halign title=right,
title={#2},fonttitle=\bfseries,#1}

\newtcolorbox{senderbox}[2][]{%
colback=blue!10!white,colframe=blue!70!black,
title={#2},fonttitle=\bfseries,#1}

\def\printsms@fromdate{}
\def\printsms@sender{}
% international phone number prefix. this is the default value
% you can change it using the prefix key in \printsms
\def\printsms@prefix{+16}
% some keyval attributes
\define@key{printsms}{fromdate}[]{\def\printsms@fromdate{#1}}
\define@key{printsms}{sender}[]{\def\printsms@sender{#1}}
\define@key{printsms}{prefix}[+16]{\def\printsms@prefix{#1}}


% Command that will process the SMS XML file and print the boxes
\newcommand\printsms[2][]{%
 \bgroup%
 \setkeys{printsms}{#1}%
 \directlua{%
  local dom = smslib.load_xml("#2")
  smslib.set_number_prefix("\printsms@prefix")
  smslib.from_date(dom,"\printsms@fromdate")
  smslib.filter_sender(dom, "\printsms@sender")
  smslib.process_xml(dom)
  }%
  \egroup%
}

\endinput

\printsms now accepts some keyval options and can be used like this:

\documentclass{article}
\usepackage{sms}
\begin{document}
\printsms[fromdate=2020-06-01,sender=+162738495078]{newmessages.xml}
\end{document}

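The result looks like this:

[image]

Original answer:

This can be done quite easily with LuaXML, in particular with the new luaxml-transform library. This library can transform XML files to any output format using CSS selectors and simple templates.

I will provide a special LuaLaTeX package for this task. It consists of two files: the sms.sty package (containing the TeX declarations) and smslib.lua (which performs the transformation).

Here is sms.sty: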

\ProvidesPackage{sms}
\RequirePackage{tcolorbox}
\RequirePackage{luacode}
\begin{luacode*}
smslib = require "smslib"
\end{luacode*}

% settings for the boxes
% modify the values to your liking
\newtcolorbox{mebox}[2][]{%
colback=blue!10!white,colframe=red!70!black,
halign title=right,
title={#2},fonttitle=\bfseries,#1}

\newtcolorbox{senderbox}[2][]{%
colback=blue!10!white,colframe=blue!70!black,
title={#2},fonttitle=\bfseries,#1}

% Command that will process the SMS XML file and print the boxes
\newcommand\printsms[1]{%
 \directlua{%
  smslib.process_xml("#1")
  }%
}

\endinput

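It loads tcolorbox and defines two new box types: mebox is used to print messages sent by you, senderbox is used for received messages. You can change the colors and other formatting details here.

The \printsms command takes the name of the XML file and passes it to the smslib.lua library, which looks like this: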

local domobject = require "luaxml-domobject"
local transform = require "luaxml-transform"

-- module
local M = {}

-- templates for SMS print
-- template for user
local me_template = [[
\begin{mebox}[]{Me}
{\tiny subject:@{subject}, read: @{read} }\\
@{readable_date}
\tcblower
@{body}
\end{mebox}
]]

-- template for the person who messages
local other_template = [[
\begin{senderbox}[]{@{contact_name} $\langle$@{address}$\rangle$}
{\tiny subject:@{subject}, read: @{read} }\\
@{readable_date}
\tcblower
@{body}
\end{senderbox}

]]

transform.add_action("sms[type='2']", me_template)
transform.add_action("sms[type='1']", other_template)


M.process_xml = function(filename)
  local f, message = io.open(filename, "r")
  if not f then 
    print("XML file error: ", message)
    return nil, message
  end
  local content = f:read("*all")
  f:close()
  content = content:gsub("\r", "")
  local dom = domobject.parse(content)
  local converted = transform.process_dom(dom)
  tex.print(converted)
end



return M

The important part is this:

local me_template = [[
\begin{mebox}[]{Me}
{\tiny subject:@{subject}, read: @{read} }\\
@{readable_date}
\tcblower
@{body}
\end{mebox}
]]

-- template for the person who messages
local other_template = [[
\begin{senderbox}[]{@{contact_name} $\langle$@{address}$\rangle$}
{\tiny subject:@{subject}, read: @{read} }\\
@{readable_date}
\tcblower
@{body}
\end{senderbox}

]]

transform.add_action("sms[type='2']", me_template)
transform.add_action("sms[type='1']", other_template)

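The transform library processes the XML file and tests every element against the actions declared with add_action. The first parameter is a CSS selector used for matching, the second is the transformation template.

Two actions are declared here, both for the <sms> element. I found that the exported SMS XML uses the type attribute to distinguish outgoing messages from incoming ones. Incoming messages have the value 1, outgoing messages have the value 2. The [type='1'] and [type='2'] selectors test this attribute, and the matching template is called depending on its value.

The me_template and other_template variables contain the string templates used to print the messages. The @{attribute_name} strings can be used to print the values of the <sms> element's attributes. For example, @{body} outputs the message text. You can edit the templates to include more attributes as needed.

I found a full example of the XML file. I have also added your sample to this file, messages.xml: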
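To get a feeling for how the @{attribute_name} placeholders and the selection by type work together, here is a small stand-alone sketch. It is not how luaxml-transform is implemented; it only mimics the idea with string.gsub, and the names templates and render as well as the shortened templates are made up for the illustration.

```lua
-- Illustration only: mimics @{name} placeholders with string.gsub;
-- this is not the luaxml-transform implementation.
local templates = {
  -- incoming message
  ["1"] = "\\begin{senderbox}[]{@{contact_name}}@{body}\\end{senderbox}",
  -- outgoing message
  ["2"] = "\\begin{mebox}[]{Me}@{body}\\end{mebox}",
}

local function render(attrs)
  -- choose the template by the value of the type attribute
  local template = templates[attrs.type]
  if not template then return "" end
  -- with a table as replacement, gsub looks each captured name up in attrs;
  -- names that are not present are left untouched
  return (template:gsub("@{([%w_]+)}", attrs))
end

print(render{ type = "2", body = "yes" })
print(render{ type = "1", contact_name = "yaya", body = "Hi, this is first input. ok? " })
```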

<smses count="3">
<sms protocol="0" address="+1234567890" date="1602132754403" type="1" subject="null" body="Hi, this is first input. ok? " toa="null" sc_toa="null" service_center="+998877665544" read="1" status="-1" locked="0" date_sent="1602132750000" sub_id="1" readable_date="8 oct. 2020 10:22:34" contact_name="yaya" />
<sms protocol="0" address="332" date="1285799668193" type="2" subject="null" body="Sample Message Sent from the phone" toa="null" sc_toa="null" service_center="null" read="1" status="-1" locked="0" readable_date="Sep 30, 2010 8:34:28 AM" contact_name="(Unknown)" />
<sms protocol="0" address="4433221123" date="1289643415810" type="1" subject="null" body="Sample Message received by the phone" toa="null" sc_toa="null" service_center="null" read="0" status="-1" locked="0" readable_date="Nov 13, 2010 9:16:55 PM" contact_name="(Unknown)" />
</smses>

Here is a sample TeX file:

\documentclass{article}
\usepackage{sms}
\begin{document}
\printsms{messages.xml}
\end{document}

The final PDF looks like this:

[image]
