I have a text file with 2 columns, and I need to merge all the rows in column 2 up to the next attribute ID in column 1

Question 1

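This can be accomplished using Power Query, available in Windows Excel 2010+ and Excel 365 (Windows or Mac).

To use Power Query

- Use the code in the first lines below as an example of how to read the text file into Power Query
  - In my case, I stored the extracted file on the desktop
- `Data => Get & Transform => From Text/CSV`
- When the PQ Editor opens: `Home => Advanced Editor`
- Make note of the path in line 2
- Paste the M code below in place of what you see
- Change the path in line 2 back to what was generated originally
- Read the comments and explore the `Applied Steps` to understand the algorithm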

let

//Use non-existent character for delimiter to keep from splitting anything
    Source = Csv.Document(File.Contents("C:\Users\ron\Desktop\explanatoryNotes.txt"),
        [Delimiter=Character.FromNumber(1), Columns=1, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Removed Blank Rows" = Table.SelectRows(Source, each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Blank Rows",{{"Column1", "Explanatory Notes"}}),
    #"Removed Top Rows" = Table.Skip(#"Renamed Columns",1),
    #"Added Custom" = Table.AddColumn(#"Removed Top Rows", "Custom", each 
    
        //replace first space with a SOH
        let 
            noteList = Text.ToList([Explanatory Notes]),
            firstSpace = List.PositionOf(noteList," ",Occurrence.First),
            replWithPlaceHolder = List.ReplaceRange(noteList,firstSpace,1,{Character.FromNumber(1)}),
            theString = Text.Combine(replWithPlaceHolder)
        in 
            theString),

//Remove original column, then split on that first SOH
//should leave a space in first column which can be replaced with null
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Explanatory Notes"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns", "Custom", Splitter.SplitTextByEachDelimiter({Character.FromNumber(1)}, 
        QuoteStyle.Csv, false), {"Attribute ID","Explanatory Note"}),

//Trim any leading (and trailing) spaces
    #"Trimmed Text" = Table.TransformColumns(#"Split Column by Delimiter",{
            {"Attribute ID", Text.Trim, type text}, {"Explanatory Note", Text.Trim, type text}}),

//Replace empty cells in Attribute Column with Nulls to enable fill down
    addNulls = Table.TransformColumns(#"Trimmed Text",{{"Attribute ID", each if Text.Length(_)=0 then null else _, type text}}),
    #"Filled Down" = Table.FillDown(addNulls,{"Attribute ID", "Explanatory Note"}),

//Group by ID (with GroupKind.Local in case there are separate identical groups)
//  and concatenate the Note lines
    #"Grouped Rows" = Table.Group(#"Filled Down", {"Attribute ID"}, {
        {"Explanatory Notes", each Text.Combine([Explanatory Note]," "), type text}},
        GroupKind.Local)
in
    #"Grouped Rows"

Some lines of the output

Answer

Question 2

Just to provide an alternative solution, here is how to extract this text using the Linux shell and `cat`, `tr`, `awk`, and `sed`:

Prepare the index file:

cat explanatoryNotes.txt|grep -v ^' '|awk '{print $1}' > indexes

Remove all newlines from the text file:

cat explanatoryNotes.txt|tr -d '\n' > explanatoryNotes.txt-no_newlines

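Search for each index and replace it with a newline followed by the index (`nl` holds a literal newline; `$'\n'` requires bash):

nl=$'\n'
for i in $(cat indexes); do sed -i "s/${i} /\\${nl}${i} /g" explanatoryNotes.txt-no_newlines; done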

For this text manipulation to work correctly, the index strings must appear only at the start of a line, never within the note text.
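If index strings might also occur inside the note text, a single `awk` pass that looks only at how each line begins may be more robust. The following is a minimal sketch, not a tested replacement for the steps above. It assumes that continuation lines start with at least one space and that every other non-blank line starts a new record (so a header row, if present, is treated as a record too). The function name `merge_notes` is hypothetical.

```shell
#!/bin/sh
# merge_notes FILE  (hypothetical helper name)
# Assumption: a line starting with a space continues the previous record;
# any other non-blank line starts a new record with its Attribute ID.
merge_notes() {
    awk '
        /^[ \t]*$/ { next }                  # skip blank lines
        /^ / {                               # continuation line
            sub(/^ +/, "")                   # drop the leading spaces
            buf = (buf == "" ? $0 : buf " " $0)
            next
        }
        {                                    # a new record begins here
            if (buf != "") print buf         # emit the finished record
            buf = $0
        }
        END { if (buf != "") print buf }     # emit the last record
    ' "$1"
}
```

Calling `merge_notes explanatoryNotes.txt > merged.txt` would write one line per attribute ID. No index file is needed, because the decision is based on the leading space rather than on matching the ID text.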

Answer