在大量 Excel 文件中搜索内容

在大量 Excel 文件中搜索内容

我有 2500 个 Excel 文件。我需要输出某一列中包含特定字符串的所有行。我该怎么做?如果特定字符串不在固定列中,而是可能在任意列中,该怎么办?

答案1

这是一些 skelet-O 代码。您可以在此基础上进行构建,也可以让其他人来构建。还有大量代码尚未编写。也许我回家后会写更多。

Option Explicit

Sub findInFolders()
    Dim folderName As String 'this is where all the files reside, some extra work is neede if there are sub directories
    'folderName = <put your folder name here>
    Dim files() As String: Set files = GetFolderContents
    Dim i As Integer

    Dim wb As Workbook, sht As Worksheet
    For i = LBound(files) To UBound(files)

        Set wb = Application.Workbooks.Open(files(i))
        For Each sht In wb.Sheets
            GetRowsBasedOnString searchString, sht
        Next sht

        wb.Close False
        Set wb = Nothing
    Next i
End Sub


Function GetFolderContents(folderName As String) As String()
    Dim fso As FileSystemObject: Set fso = New FileSystemObject
    GetFolderContents = fso.GetFolder(folderName).files
End Function

Function GetRowsBasedOnString(searchString As String, sht As Worksheet)
    'loop through range or use find or whatever. Find the value your looking for
    Dim found As Boolean, rng As Range
    If found Then ReportFoundRow rng
End Function

Function ReportFoundRow(foundRow As Range)
    'write your found data to your master workbook
End Function

答案2

从你的评论来看,我猜你从未写过 VBA 宏。你的第一个宏会像爬山一样难,但之后的每个宏都会变得更容易,直到你忘记你曾经认为它们可能很难编写。

下面的宏假设所有 2,500 个工作簿都位于同一个文件夹中。这通常是最简单的方法,但在您的案例中可能行不通。如果行不通,请选择包含大量工作簿的文件夹来试用此宏。您必须在问题中添加对您的情况的解释,以便我可以解释如何调整此宏来解决这个问题。

我试图让事情变得简单,尽管看起来可能并非如此。有更好、更快的方法可以做同样的事情,但我认为这是正确的妥协。我添加了很多注释来解释代码的作用。宏编辑器的帮助将解释语法。但如果你遇到困难,一定要问。

在您为测试选择的文件夹中创建一个新工作簿。我的代码需要一个名为“Bobert”的工作表,这对我来说很方便。选择一个对您有意义的名称并更改代码以匹配;我稍后会告诉您如何操作。

选择Tools然后Macro然后Visual Basic Editor或单击Alt+ F11

左下方是项目浏览器。右上方是灰色区域。右下方是即时窗口。我稍后会谈到即时窗口。

Insert然后选择Module。“Module1”将添加到项目资源管理器中,灰色区域将变为白色。这是 Module1 的代码区域。

您可以将模块名称保留为“Module1”,也可以更改它。单击 F4。将显示“属性”窗口。模块的唯一属性是其名称。单击“(Name) Module1”中的“Module 1”,退出“Module1”,然后输入您选择的名称。关闭“属性”窗口。

将下方代码复制到代码区。

此宏解决了问题的第一部分:它查找文件夹中的所有工作簿以及这些工作簿中的所有工作表。它在工作表“Bobert”中创建这些工作表的列表。如果无法将 2,500 个工作簿合并到一个文件夹中,您可能需要像这样的宏来构建要检查的工作簿和工作表的列表,但此宏旨在作为培训练习。创建标题行:

 A1 = Folder
 B1 = Workbook
 C1 = Worksheet

您唯一需要立即更改的声明是:

  Set WShtDest = ActiveWorkbook.Worksheets("Bobert")

将“Bobert”更改为您为将在其中创建工作表列表的工作表选择的名称。

将光标放在语句上:

    RowDestCrnt = .Cells(Rows.Count, "A").End(xlUp).Row + 1

并单击 F9。该行将变为棕色,因为您已将其设置为断点,稍后我将对此进行解释。

每次单击 F8 时,都会执行代码的一个语句。这样您就可以逐步执行代码。如果将光标放在变量名称上,将显示其值。您可以切换到工作表来检查发生了什么变化。

如果您认为您理解了某个代码块,请单击 F5,代码将运行到下一个断点。我设置了一个断点,但您可以根据需要设置任意多个。

我希望这能给你一些思考。回答我的问题,我可以给你下一个解决方案。

Option Explicit
' Searching for content in a large number of Excel files
' http://superuser.com/q/452980/108084
Sub ListWorksheets()

  Dim ColDestCrnt As Long
  Dim FileNameList() As String
  Dim InxFNL As Long
  Dim InxW As Long
  Dim PathCrnt As String
  Dim RowDestCrnt As Long
  Dim WBkSource As Workbook
  Dim WShtDest As Worksheet

  Application.ScreenUpdating = False

  ' Create a reference to the worksheet in which data will be stored
  ' Change "Bobert" to your name for the destination worksheet.
  Set WShtDest = ActiveWorkbook.Worksheets("Bobert")

  ' This assumes the source workbooks are in the same folder as the workbook
  ' holding this macro.  You could replace this with a statement like:
  '   PathCrnt = "C:\MyFiles"
  PathCrnt = ActiveWorkbook.Path

  ' GetFileNameList is a subroutine I wrote sometime ago.  It returns an
  ' array of filenames within a specified folder (PathCrnt) that match a
  ' specified format (*.xls).
  Call GetFileNameList(PathCrnt, "*.xls", FileNameList)

  ' Get the next free row in worksheet Bobert.  By calling this routine with
  ' different values for PathCrnt, you could built up a list containing files
  ' from many folders.
  With WShtDest
    RowDestCrnt = .Cells(Rows.Count, "A").End(xlUp).Row + 1
  End With

  For InxFNL = LBound(FileNameList) To UBound(FileNameList)
    If FileNameList(InxFNL) <> ActiveWorkbook.Name Then
      ' In the Workbook Open statement, 0 means "do not update any links" and
      ' True means "open read only"
      Set WBkSource = Workbooks.Open(PathCrnt & "\" & FileNameList(InxFNL), 0, True)
      With WBkSource
        ' Record the name of each worksheet in the workbook
        For InxW = 1 To .Worksheets.Count
          WShtDest.Cells(RowDestCrnt, "A").Value = PathCrnt
          WShtDest.Cells(RowDestCrnt, "B").Value = FileNameList(InxFNL)
          WShtDest.Cells(RowDestCrnt, "C").Value = .Worksheets(InxW).Name
          RowDestCrnt = RowDestCrnt + 1
        Next
        .Close SaveChanges:=False     ' Close this source workbook
      End With
    End If
  Next

End Sub
Sub GetFileNameList(ByVal PathCrnt As String, ByVal FileSpec As String, _
                                            ByRef FileNameList() As String)

' This routine sets FileNameList to the names of files within folder
' PathCrnt that match FileSpec.  It uses function Dir$() to get the file names.
' I can find no documentation that says Dir$() gets file names in alphabetic
' order but I have not seen a different sequence in recent years.

  Dim AttCrnt As Long
  Dim FileNameCrnt As String
  Dim InxFNLCrnt As Long

  ' I initialise the array with space for 100 files and then enlarge it if
  ' necessary.  This is normally enough space for  my applications but since
  ' you are expecting 2,500 files I have replaced the original statement.
  'ReDim FileNameList(1 To 100)
  ReDim FileNameList(1 To 2500)
  InxFNLCrnt = 0

  ' Ensure path name ends in a "\"
  If Right(PathCrnt, 1) <> "\" Then
    PathCrnt = PathCrnt & "\"
  End If

  ' This Dir$ returns the name of the first file in
  ' folder PathCrnt that matches FileSpec.
  FileNameCrnt = Dir$(PathCrnt & FileSpec)
  Do While FileNameCrnt <> ""
    ' "Files" have attributes, for example: normal, to-be-archived, system,
    ' hidden, directory and label. It is unlikely that any directory will
    ' have an extension of XLS but it is not forbidden.  More importantly,
    ' if the files have more than one extension so you have to use "*.*"
    ' instead of *.xls", Dir$ will return the names of directories. Labels
    ' can only appear in route directories and I have not bothered to test
    ' for them
    AttCrnt = GetAttr(PathCrnt & FileNameCrnt)
    If (AttCrnt And vbDirectory) <> 0 Then
      ' This "file" is a directory.  Ignore
    Else
      ' This "file" is a file
      InxFNLCrnt = InxFNLCrnt + 1
      If InxFNLCrnt > UBound(FileNameList) Then
        ' There is a lot of system activity behind "Redim Preserve".  I reduce
        ' the number of Redim Preserves by adding new entries in chunks and
        ' using InxFNLCrnt to identify the next free entry.
        ReDim Preserve FileNameList(1 To 100 + UBound(FileNameList))
      End If
      FileNameList(InxFNLCrnt) = FileNameCrnt
    End If
    ' This Dir$ returns the name of the next file that matches
    ' the criteria specified in the initial call.
    FileNameCrnt = Dir$
  Loop

  ' Discard the unused entries
  ReDim Preserve FileNameList(1 To InxFNLCrnt)

End Sub

相关内容