SSGREP

SSGREP is a tool for searching files in your version-control repository. While your version-control system may or may not offer a search facility, SSGREP is likely to be more powerful, since it gives you the full power of Perl's regular expressions. Furthermore, by default, SSGREP strips all comments from the code when searching known file types, so that you only get matches from program code. SSGREP can produce the output as plain text or HTML, or save it to a database.

Contents:
   Command-line syntax
   Specifying input
   Specifying Search Patterns
   Search Behaviour
   Output Format
      Using ‑window
      Using ‑crossref
   Using a Database for the Output
      Tables Created by SSGREP
      Resuming a Search
      Notes on Security

Command-line syntax

ssgrep -config config-file | -VC VC-path [-type type ... ] [-lang VB|SQL|C ...] 
       [-noignorecase] [-comments] 
       [-HTML] [-crossref] [-window n] 
       [-database database [[-Server Server] [-User User] [-Password Pwd]]]
        -input file | -resume | searchstr1 [searchstr2 ...] 

Rather than supplying a complete table with all options, I describe them group by group.

Specifying input

To specify which files SSGREP is to search, you must specify one of ‑config or ‑VC. Use ‑VC when you want to search a single VC-path. If you want to search multiple paths, use a config-file with the ‑config option. No matter which option you use, SSGREP always searches the most recently checked-in version of each file. Thus any explicit label in the config-file is ignored.

SSGREP ignores these files:

You can use the options ‑type and ‑lang to restrict SSGREP to only search certain files. Both these options can appear multiple times on the command line. With ‑type you specify a file extension, with or without the leading period. For instance, ‑type .sp ‑type TRI instructs SSGREP to only search stored procedure and trigger files. ‑lang is a shortcut to ‑type that permits you to specify all types for a certain programming language in one go. The following values are recognised:
SQL – The SQL files according to the AbaPerls SQL directory structure.
VB – Files of the types .bas, .cls, .frm and .vb.
C – Files of the types .c, .cs, .cpp, .h and .hpp.
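
The relation between ‑type and ‑lang can be sketched as follows. This is a Python illustration of the rules above, not SSGREP's own code. The names LANG_TYPES, wanted_extensions and is_selected are invented for the example, and the case-insensitive comparison of extensions is an assumption. SQL is left out, since its file types follow the AbaPerls SQL directory structure, which is not listed here.

```python
import os

# Extensions per -lang value, taken from the list above.
# SQL is deliberately omitted: its types are not enumerated in this text.
LANG_TYPES = {
    "VB": [".bas", ".cls", ".frm", ".vb"],
    "C": [".c", ".cs", ".cpp", ".h", ".hpp"],
}

def wanted_extensions(types=(), langs=()):
    """Collect the extensions selected by repeated -type and -lang options."""
    exts = set()
    for t in types:
        t = t.lower()  # assumption: extensions are compared without regard to case
        exts.add(t if t.startswith(".") else "." + t)  # leading period is optional
    for lang in langs:
        exts.update(LANG_TYPES[lang.upper()])
    return exts

def is_selected(filename, exts):
    """With no -type/-lang given, every file is a candidate."""
    if not exts:
        return True
    return os.path.splitext(filename)[1].lower() in exts

exts = wanted_extensions(types=[".sp", "TRI"])
print(sorted(exts))
print(is_selected("$/abaperls/sql/SP/ap_sob_update_sp.sp", exts))
```

Here ‑type .sp ‑type TRI ends up as the two extensions .sp and .tri, so a stored-procedure file is selected while a .bas file would not be.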

Specifying Search Patterns

The search patterns for SSGREP are regular expressions according to the rules of Perl. The manual page for SSREPLACE includes a crash-course in regular expressions. Note that you cannot search for matches that span multiple lines, as SSGREP matches each line separately against the search pattern.

If you supply multiple search strings, SSGREP joins them to a single regular expression with the | (OR) operator. That is:

SSGREP -VC $/MyProject this_string that_string yet_another_string

is the same as

SSGREP -VC $/MyProject "this_string|that_string|yet_another_string"
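
The joining rule can be sketched in a few lines of Python. This only illustrates the behaviour described above and is not SSGREP's own code. The function name join_patterns is made up for the example, and Python's re module stands in for Perl's regular expressions.

```python
import re

def join_patterns(patterns):
    """Join several search strings into one alternation with the | operator."""
    return "|".join(patterns)

combined = join_patterns(["this_string", "that_string", "yet_another_string"])

# Case-insensitive matching mirrors SSGREP's default search behaviour.
regex = re.compile(combined, re.IGNORECASE)

print(combined)
print(bool(regex.search("SELECT That_String FROM tbl")))
```

Since the strings are joined as they are, any | or parentheses inside one search string become part of the combined expression.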

Tip: if you want to get hits on whole words only, enclose the string in \b. That is, if you say

SSGREP -VC $/MyProject trip

this will list lines that include words like strip or tripping, whereas

SSGREP -VC $/MyProject \btrip\b

will only list matches with trip as such.
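
The effect of \b can be tried out in Python, whose re module treats \b the same way as Perl does for plain ASCII words. The sample lines are invented for the illustration.

```python
import re

lines = ["strip the comments", "a tripping hazard", "the trip home"]

loose = re.compile(r"trip", re.IGNORECASE)       # matches anywhere in a word
whole = re.compile(r"\btrip\b", re.IGNORECASE)   # matches trip as a whole word only

loose_hits = [text for text in lines if loose.search(text)]
whole_hits = [text for text in lines if whole.search(text)]

print(loose_hits)
print(whole_hits)
```

The first pattern hits all three lines, the second only the last one.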

You can specify the search patterns in three different ways. Most of the time you will use the command line, but if you have very many search strings, or you have problems with characters that are special to the DOS command line, you can specify an input file with the ‑input option. In the file you can specify multiple search patterns, and SSGREP will join them with the | operator into a single regular expression. SSGREP strips leading and trailing spaces from the lines and ignores blank lines. There is no provision for comments in the file.
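
How an input file turns into one regular expression can be sketched like this. It is a Python illustration of the rules just described, not SSGREP's own code. The function name read_pattern_file and the sample contents are hypothetical, and treating a line that consists only of spaces as blank is an assumption.

```python
def read_pattern_file(text):
    """Build a single regular expression from the contents of an -input file."""
    patterns = []
    for line in text.splitlines():
        line = line.strip(" ")   # leading and trailing spaces are stripped
        if line:                 # blank lines are ignored
            patterns.append(line)
    # There is no comment syntax: every remaining line is a pattern.
    return "|".join(patterns)

sample = "  ap_sob_update_sp  \n\nap_sub_rename_sp\n# not a comment\n"
combined = read_pattern_file(sample)
print(combined)
```

Note that the line starting with # ends up in the expression like any other pattern, which is what the lack of comment support implies.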

The third way is the ‑resume option, which you can only use when you also specify the ‑database option. See further below about using a database.

Search Behaviour

There are two options to control how SSGREP performs the search. By default, SSGREP performs a case-insensitive search after having stripped all comments from the files. More precisely, SSGREP strips comments from files whose format AbaPerls understands. These are the same file types as those listed for the ‑lang option above.

Use the option ‑noignorecase to perform a case-sensitive search.

Use the option ‑comments to also include comments in the search.

Output Format

You can choose between three different formats: plain text, HTML and database. I discuss the database format in a separate section.

Plain text is the default. For instance, this search:

ssgrep -VC data6/$/abaperls ignorecase

produces this output (in part):

$/abaperls/Perl/ssgrep.bat
   159:             $opt_comments $opt_ignorecase $opt_crossref $opt_debug
   169: ssgrep -config konfig-fil | -VC VC-path [-window n] [-noignorecase]
   176: $opt_ignorecase  = 1;
   181:            "ignorecase!"      => \$opt_ignorecase,
   306:        if ($opt_ignorecase) {
   322:              if ($opt_ignorecase) {

$/abaperls/Perl/tblcnt.bat
    91: $Getopt::Long::ignorecase = 0;

$/abaperls/Perl/tblfix.bat
   175: $Getopt::Long::ignorecase = 0;

That is, you get one section per file, and then SSGREP prints each matching line.

SSGREP prints the output to STDOUT, but unless you expect a very small number of matches, use the > operator to direct the output to a file.

When you specify ‑HTML, SSGREP formats the output in HTML and highlights the matches to make it easier to read. Here is an example:

$/abaperls/Perl/ssgrep.bat:
   159: $opt_comments $opt_ignorecase $opt_crossref $opt_debug 
   169: ssgrep -config konfig-fil | -VC VC-path [-window n] [-noignorecase]
   176: $opt_ignorecase = 1; 
   181: "ignorecase!" => \$opt_ignorecase, 
   306: if ($opt_ignorecase) { 
   322: if ($opt_ignorecase) { 

$/abaperls/Perl/tblcnt.bat:
    91: $Getopt::Long::ignorecase = 0; 

$/abaperls/Perl/tblfix.bat:
   175: $Getopt::Long::ignorecase = 0; 

There are no links to click, though.

Using ‑window

Normally, SSGREP only prints matching lines, but you can use the ‑window option to get more context. For instance:

ssgrep -VC data6/$/abaperls -window 2 ignorecase

Results in this output (in part):

$/abaperls/Perl/spfix.bat
    97:
    98--> $Getopt::Long::ignorecase = 0;
    99:   my $USAGE = "spfix [-shortnames file] [-upcase] [-cleanup] file1 [file2...]";

$/abaperls/Perl/ssgrep.bat
   158:   use vars qw($opt_config $opt_VC $opt_window $opt_HTML
   159-->             $opt_comments $opt_ignorecase $opt_crossref $opt_debug
   160:               $opt_input

   168:   my $USAGE = <<USAGEEND;
   169--> ssgrep -config konfig-fil | -VC VC-path [-window n] [-noignorecase]
   170:          [-comments] [-HTML] [-crossref]

   175:   $opt_window      = 1,
   176--> $opt_ignorecase  = 1;
   177:   GetOptions("config=s"         => \$opt_config,

That is, besides the matching line, SSGREP also prints the line before and after the match. The line with the match is highlighted. If you specify ‑window 0, SSGREP only prints the names of the matching files, but does not print the lines.
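
The selection of context lines can be sketched as below. This is a Python illustration, not SSGREP's own code, and the function name context_lines is invented. It assumes that ‑window n shows n - 1 lines on each side of a match. That fits the example above for ‑window 2 and the behaviour without ‑window, but for larger values it is only an inference.

```python
import re

def context_lines(lines, pattern, window=1):
    """Return (line number, has_match, text) tuples for one file.

    Assumption: -window n shows n - 1 lines on each side of a match.
    """
    if window == 0:
        return []  # only the file name would be reported
    regex = re.compile(pattern, re.IGNORECASE)
    hits = {i for i, text in enumerate(lines) if regex.search(text)}
    span = window - 1
    selected = set()
    for i in hits:
        # Clamp the context range to the start and end of the file.
        selected.update(range(max(0, i - span), min(len(lines), i + span + 1)))
    return [(i + 1, i in hits, lines[i]) for i in sorted(selected)]

source = ["", "$Getopt::Long::ignorecase = 0;", 'my $USAGE = "spfix ...";', "exit;"]
for number, has_match, text in context_lines(source, "ignorecase", window=2):
    marker = "-->" if has_match else ":  "
    print(f"{number:6}{marker} {text}")
```

The has_match flag separates real matches from pure context lines, in the same way as the arrow does in the plain-text output.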

Using ‑crossref

The option ‑crossref rearranges the output, so that there is one section per unique matched string, case-sensitive. Take this example:

ssgrep -VC data6/$/abaperls/sql -crossref aba\w*sysobjects -HTML > test.html

Here is an extract of the output:

abahistsysobjects

$/abaperls/sql/SP/ap_sob_update_sp.sp:

   466: FROM abahistsysobjects 
   470: INSERT abahistsysobjects (objname, loadtime, objtype, isdeletion, subsystem,
   489: INSERT abahistsysobjects (objname, loadtime, objtype, isdeletion, 

$/abaperls/sql/SP/ap_sub_rename_sp.sp:

   102: UPDATE abahistsysobjects
   107: UPDATE abahistsysobjects 
_______________________________________________________________________________________________

abasysobjects

$/abaperls/sql/SP/ap_sob_get_fileversion_sp.sp:

    71: FROM abasysobjects sob 
    79: FROM abasysobjects sob2 

The search pattern matches two different tables, and there is one section per table.

If there is more than one match on the same line, the line is included in the sections for both matches.
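
The regrouping can be sketched in Python as follows. This is an illustration of the rules above, not SSGREP's own code. The function name crossref and the sample files are made up for the example.

```python
import re
from collections import defaultdict

def crossref(files, pattern):
    """Map each distinct matched string to {file name: [(line number, text)]}."""
    regex = re.compile(pattern, re.IGNORECASE)
    sections = defaultdict(lambda: defaultdict(list))
    for name, lines in files.items():
        for number, text in enumerate(lines, start=1):
            # The set keeps the matched strings exactly as written, so sections
            # are case-sensitive, and a line with several different matches is
            # added once to each of their sections.
            for matched in sorted({m.group(0) for m in regex.finditer(text)}):
                sections[matched][name].append((number, text))
    return sections

files = {
    "a.sp": ["FROM abahistsysobjects", "SELECT 1",
             "FROM abasysobjects JOIN abahistsysobjects"],
    "b.sp": ["UPDATE abasysobjects"],
}
sections = crossref(files, r"aba\w*sysobjects")
for matched in sorted(sections):
    print(matched, dict(sections[matched]))
```

Line 3 of a.sp appears under both abahistsysobjects and abasysobjects, since it contains both names.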

‑crossref is very practical when you want to see which files call a certain set of stored procedures. Not least, it lets you see whether a procedure is used at all. Preferably, you should include the path where the procedure is stored. If you only get a hit for that file, the procedure may not be in use. If you get no hit at all – in which case there is no section – maybe you did not spell the name correctly?

Using a Database for the Output

Note: a better database alternative may be VCDBLOAD, which loads the source code into a full-text indexed SQL Server database.

A third output option is to direct the output to a database. This is useful if you are searching for very many search strings in very many files. For instance, you could search your entire version-control repository for all your stored procedures to get a complete cross-reference, which you can then query for specifics. A special advantage is that when you use a database as output, you can resume an interrupted search.

To direct output to a database, you specify the ‑database option, and optionally you can also specify ‑Server, ‑User and ‑Password. This applies:

When you specify ‑database, you cannot specify ‑HTML or ‑crossref. (The database format has cross-referencing built-in so to speak.)

Tables Created by SSGREP

SSGREP creates five tables in the specified database. All tables are created in the ssgrep schema:

ssgrep.settings – This table saves the search options ‑ignorecase and ‑comments.
ssgrep.patterns – This table saves the search patterns you have specified.
ssgrep.files – The files that SSGREP has processed.
ssgrep.matches – All matched strings per file.
ssgrep.matchlines – All matched lines per file match.

If the tables already exist, SSGREP drops and recreates them, unless you specify the ‑resume option.

Here is the schema for the tables:

CREATE TABLE ssgrep.settings (ignorecase bit NOT NULL,
                              comments   bit NOT NULL)

CREATE TABLE ssgrep.patterns (pattern nvarchar(MAX) NOT NULL)

CREATE TABLE ssgrep.files
       (fileid int IDENTITY PRIMARY KEY,
        name   nvarchar(400) COLLATE Latin1_General_CI_AS NOT NULL UNIQUE,
        regdate datetime NOT NULL DEFAULT getdate())

CREATE TABLE ssgrep.matches
       (fileid int                NOT NULL REFERENCES ssgrep.files,
        matchstring nvarchar(128) COLLATE Latin1_General_BIN2 NOT NULL,
        PRIMARY KEY (fileid, matchstring),
        UNIQUE (matchstring, fileid))

CREATE TABLE ssgrep.matchlines
      (fileid   int            NOT NULL,
       matchstring nvarchar(128) COLLATE Latin1_General_BIN2 NOT NULL,
       linenum  int            NOT NULL,
       hasmatch bit            NOT NULL,
       linetext nvarchar(1024) NOT NULL,
       PRIMARY KEY (fileid, matchstring, linenum),
       UNIQUE (matchstring, fileid, linenum),
       FOREIGN KEY (fileid, matchstring) REFERENCES ssgrep.matches)

The first two are mainly intended to support the ‑resume option (see below), but you can use them to review the settings for the search. Note that in ssgrep.patterns, you will see the resulting regular expression, in which SSGREP has joined all search expressions with the | operator.

ssgrep.files lists all searched files and defines an id for each file. ssgrep.matches gives you a quick reference for which files had which matches, or, if you like, which procedures matched in which files. These two tables should be self-explanatory. Note that the collation for matchstring is binary, so aggregations will be case-sensitive.

ssgrep.matchlines has the actual lines that matched. linenum is the line number and linetext is the contents of that line. hasmatch is 1 if there is a match on that line, and 0 if the line is only a context line. The latter can only occur if you specified ‑window with a value > 1. If you specify ‑window 0, ssgrep.matchlines will be empty. There will still be data in ssgrep.matches.

Resuming a Search

By using the ‑resume option, you can resume a search that was interrupted. ‑resume also permits you to supplement a search if you forgot to include some directory in the first round. You can only specify ‑resume with ‑database. When you specify ‑resume, the database must exist, and all five tables in the ssgrep schema must exist.

When you use ‑resume, SSGREP reads the search strings from the database, and thus it is not legal to specify ‑input or search patterns as arguments with ‑resume. SSGREP also reads the settings for ‑noignorecase and ‑comments from the database and ignores what you specified on the command line.

SSGREP checks in ssgrep.files whether a file has already been processed, and if so, skips the file.
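
The skip rule can be sketched in Python. This illustrates the idea only and is not SSGREP's own code. The function name files_to_search and the file names are hypothetical, and the list already_processed stands in for the names stored in ssgrep.files. Since the name column has a case-insensitive collation, the sketch compares names in lower case, which is only a rough stand-in for the collation.

```python
def files_to_search(candidates, already_processed):
    """Return the candidate files that have not yet been processed."""
    # lower() approximates the case-insensitive collation of ssgrep.files.name.
    done = {name.lower() for name in already_processed}
    return [name for name in candidates if name.lower() not in done]

already_processed = ["$/abaperls/Perl/ssgrep.bat", "$/abaperls/Perl/tblcnt.bat"]
candidates = [
    "$/abaperls/Perl/SSGREP.bat",   # same file, different case
    "$/abaperls/Perl/tblcnt.bat",
    "$/abaperls/Perl/tblfix.bat",
]
remaining = files_to_search(candidates, already_processed)
print(remaining)
```

Only tblfix.bat is left to search. This is also how a forgotten directory gets picked up: its files are not yet in ssgrep.files.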

Notes on Security

SSGREP does not set up any permissions on the tables or in the database. If you want other users to access the information in the SSGREP tables, you need to set this up yourself.