I have an Excel 365 table of log entries. The log lines are grouped by start and end tags.
Column C: =IF( LEFT([@LOGS],6)="<Event","start", IF(LEFT([@LOGS],7)="</Event","end", ""))
Raw data:
TAG_START_END LOGS
start <Event time="Sat Apr 15 1:13:17.750" type="Debug" thread="2fec: Main CIM worker thread" elapsed="0"
Database: C:\2023 Files\GDBs_4\New Mobile Geodatabase.geodatabase
SQL: SELECT main.ACTIVE_TRANSPORTATION.OBJECTID,main.active_transportation_nt_flag_vw.flag FRO
end </Event>
start <Event time="Sat Apr 15 1:13:17.749" type="Debug" thread="2fec: Main CIM worker thread" elapsed="1"
end </Event>
start <Event time="Sat Apr 15 1:13:17.749" type="Debug" thread="2fec: Main CIM worker thread" elapsed="2"
Database: C:\2023 Files\GDBs_4\New Mobile Geodatabase.geodatabase
SQL: SELECT main.ACTIVE_TRANSPORTATION.OBJECTID,main.active_transportation_nt_flag_vw.flag FRO
Number of features returned: 100
end </Event>
start <Event time="Sat Apr 15 1:13:17.749" type="Debug" thread="2fec: Main CIM worker thread" elapsed="0"
end </Event>
start <Event time="Sat Apr 15 1:13:17.747" type="Debug" thread="2fec: Main CIM worker thread" elapsed="0"
Database: C:\2023 Files\GDBs_4\New Mobile Geodatabase.geodatabase
SQL: SELECT main.ACTIVE_TRANSPORTATION.OBJECTID,main.active_transportation_nt_flag_vw.flag FRO
end </Event>
start <Event time="Sat Apr 15 1:13:17.747" type="Debug" thread="2fec: Main CIM worker thread" elapsed="2"
Database: C:\2023 Files\GDBs_4\New Mobile Geodatabase.geodatabase
SQL: SELECT main.ACTIVE_TRANSPORTATION.OBJECTID,main.active_transportation_nt_flag_vw.flag FRO
Number of features returned: 100
end </Event>
start <Event time="Sat Apr 15 1:13:17.746" type="Debug" thread="2fec: Main CIM worker thread" elapsed="1"
end </Event>
start <Event time="Sat Apr 15 1:13:17.746" type="Debug" thread="2fec: Main CIM worker thread" elapsed="0"
end </Event>
start <Event time="Sat Apr 15 1:13:17.744" type="Debug" thread="2fec: Main CIM worker thread" elapsed="0"
Database: C:\2023 Files\GDBs_4\New Mobile Geodatabase.geodatabase
SQL: SELECT main.ACTIVE_TRANSPORTATION.OBJECTID,main.active_transportation_nt_flag_vw.flag FRO
end </Event>
start <Event time="Sat Apr 15 1:13:17.743" type="Debug" thread="2fec: Main CIM worker thread" elapsed="1"
end </Event>
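For each tag group, I want to populate a GROUP_ID column.
In the screenshot above, I entered the values in that field manually. Now I want to find a way to populate it with a formula or Power Query.
How can I populate the GROUP_ID column?
Reason: I ultimately want to find a way to convert the LOGS information into columns: https://i.stack.imgur.com/EfTUV.png. I think a GROUP_ID column would help with that.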
Edit:
This can be done in SQL, as in "Oracle SQL - Assign GROUP_ID to rows based on start/end tags":
sum(case when log_tags like '<Event%' then 1 else 0 end) over (order by id)
Can something similar be done in Excel?
Answer 1
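With Power Query, you can define the AddSeq and AddGroup steps so that each one references itself, provided the result is well defined. When a step refers to itself recursively like this, its name must be prefixed with @.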
let
    // Load the Excel table named "Table"
    Source = Excel.CurrentWorkbook(){[Name="Table"]}[Content],
    TypeChanged = Table.TransformColumnTypes(Source,{{"TAG_START_END", type text}, {"LOGS", type text}}),
    // 0-based index so each row can look up the previous row
    IndexAdded = Table.AddIndexColumn(TypeChanged, "Index", 0, 1),
    // Seq restarts at 1 on "start"; otherwise it is the previous row's Seq + 1
    AddSeq = Table.AddColumn(IndexAdded, "Seq",
        each if [TAG_START_END]="start" then 1 else @AddSeq[Seq]{[Index]-1} +1),
    // GroupID increments on "start"; otherwise it copies the previous row's GroupID
    AddGroup = Table.AddColumn(AddSeq, "GroupID",
        each if [TAG_START_END]="start"
            then (if [Index]=0 then 1 else @AddGroup[GroupID]{[Index]-1}+1)
            else @AddGroup[GroupID]{[Index]-1})
in
    AddGroup
Here, IndexAdded inserts the index column that is needed to reference the previous row of the table ({[Index]-1}).
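The previous-row rule behind AddSeq and AddGroup can be traced outside Power Query. The following is a minimal Python sketch of that rule, not the query itself; the function name assign_ids and the sample tags list are hypothetical, and it assumes the first row is a "start" row.

```python
# Hypothetical sample: the TAG_START_END column for two groups.
tags = ["start", "", "", "end", "start", "end"]

def assign_ids(tags):
    """Return one (group_id, seq) pair per row using the previous-row rule."""
    rows = []
    for index, tag in enumerate(tags):
        if tag == "start":
            # A start row opens a new group: previous GroupID + 1 (1 on the first row).
            group = 1 if index == 0 else rows[index - 1][0] + 1
            seq = 1
        else:
            # Any other row copies the previous GroupID and extends Seq by one.
            group = rows[index - 1][0]
            seq = rows[index - 1][1] + 1
        rows.append((group, seq))
    return rows

result = assign_ids(tags)
print(result)
```

Each row only ever reads the row directly before it, which is why the M version needs the Index column.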
Here I used column C (start/end), but Text.StartsWith could be used to compute the start condition inside Power Query instead.
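The script could be extended to reorder the columns or to drop the index column.
This query is easy to write (once you know the recursive construction of the table), but it is not very efficient.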
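Here is a new, optimized version (2000 rows take 3 seconds). It reads the LOGS column directly, checks whether the line starts with <Event, and if so starts a new group sequence.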
let
    Source = Excel.CurrentWorkbook(){[Name="Table"]}[Content],
    TypeChanged = Table.TransformColumnTypes(Source,{{"LOGS", type text}}),
    // One record per table row
    Lines = Table.ToRecords(TypeChanged),
    // Single pass: the state record carries the rows built so far plus the running group and seq counters
    Result = List.Accumulate(Lines, [table={}, group=0, seq=0],
        (state, current) => if Text.StartsWith(current[LOGS],"<Event")
            then let g = state[group]+1, s = 1 in
                [ table= state[table] & {[group=g,seq=s] & current},
                  group=g, seq=s]
            else let g = state[group], s = state[seq]+1 in
                [ table= state[table] & {[group=g,seq=s] & current},
                  group=g, seq=s] ),
    Table1 = Result[table],
    Table2 = Table.FromRecords(Table1)
in
    Table2
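The fold performed by List.Accumulate above can be checked on a few lines by hand. The following is a rough Python sketch of the same single-pass idea, using functools.reduce as a stand-in for List.Accumulate; the names step and number_lines and the shortened sample lines are hypothetical.

```python
from functools import reduce

def step(state, line):
    """One fold step: state is (rows, group, seq); line is one LOGS value."""
    rows, group, seq = state
    if line.startswith("<Event"):
        # A start tag opens a new group and restarts the sequence.
        group, seq = group + 1, 1
    else:
        # Any other line stays in the current group.
        seq += 1
    return rows + [(group, seq, line)], group, seq

def number_lines(lines):
    """Fold over all lines, starting with no rows and both counters at 0."""
    rows, _, _ = reduce(step, lines, ([], 0, 0))
    return rows

# Hypothetical, shortened log lines for illustration only.
sample = ['<Event elapsed="0"', "Database: demo", "</Event>",
          '<Event elapsed="1"', "</Event>"]
numbered = number_lines(sample)
for row in numbered:
    print(row)
```

As in the SQL running sum from the question, the group number is simply the count of start tags seen so far, so no lookup of earlier rows is needed.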