[Text DataSources] Regular Expression

This recipe demonstrates how to read data from log files using a text datasource with the “Regular Expression” access type.
Steps taken:

Create the first text datasource (Rep7-Server-Log.ds). Enter the file path of server.log into the “Define Text Datasource > File URL” field. Set Encoding to “UTF-8”. Set the Date, Time and Timestamp formats to match the values in the log file. Set the “Access Type” to “Regular Expression”.

On the next page, enter the following into the “Regular Expression” field:

(.*),(.*),(.*),(.*)

Infer schema and click “Finish”.
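
To see how such an expression splits a line into columns, here is a minimal Python sketch. It assumes each capture group is the greedy wildcard (.*) and that the whole line must match; the sample line and the split_line helper are hypothetical, and the datasource's own regex engine may behave differently from Python's re module.

```python
import re

# Assumed pattern: four greedy capture groups separated by literal commas.
PATTERN = re.compile(r"(.*),(.*),(.*),(.*)")

def split_line(line):
    """Return the four captured fields, or None if the line does not match."""
    match = PATTERN.fullmatch(line)
    return match.groups() if match else None

# Hypothetical log line; the real server.log layout may differ.
sample = "2007-04-25,10:15:30,INFO,Server started"
fields = split_line(sample)
print(fields)  # ('2007-04-25', '10:15:30', 'INFO', 'Server started')

# Because the groups are greedy, any extra comma ends up in the first group.
extra = split_line("a,b,c,d,e")
print(extra)  # ('a,b', 'c', 'd', 'e')
```

Each capture group corresponds to one column when the schema is inferred, so a line with more commas than expected shifts data into the first column rather than failing to match.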

Create a second text datasource (Repertoire_ServerLog_ExtractInfo.ds). Enter the file path of server-2007-04-25.log into the “Define Text Datasource > File URL” field. Set Encoding to “ASCII”. Set the Date, Time and Timestamp formats to match the values in the log file. Set the “Access Type” to “Regular Expression”. Select the “First line is header” checkbox.

On the next page, enter the following into the “Regular Expression” field:

(.*),(.*),(.*), ERS2.Audit - \[byte-size=(.*), job-ended=(.*), job-owner=(.*), job-received=(.*), job-started=(.*), mime-type=(.*), name=(.*), page-count=(.*), record-count=(.*), status-code=(.*)\]

Infer schema and click “Finish”.
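
The audit expression can be tried outside the tool with a short Python sketch. It assumes the capture groups are (.*) and that the square brackets are escaped so they match literally; the sample line, the AUDIT_FIELDS list and the parse_audit helper are hypothetical and only illustrate the shape of the match.

```python
import re

# Key names taken from the expression above, in order.
AUDIT_FIELDS = [
    "byte-size", "job-ended", "job-owner", "job-received", "job-started",
    "mime-type", "name", "page-count", "record-count", "status-code",
]

# Three leading comma-separated groups, then one group per audit key.
# The brackets are escaped so they are literals, not a character class.
AUDIT_PATTERN = re.compile(
    r"(.*),(.*),(.*), ERS2.Audit - \["
    + ", ".join(f"{name}=(.*)" for name in AUDIT_FIELDS)
    + r"\]"
)

def parse_audit(line):
    """Return (leading_fields, audit_dict), or None if the line does not match."""
    match = AUDIT_PATTERN.fullmatch(line)
    if match is None:
        return None
    groups = match.groups()
    return groups[:3], dict(zip(AUDIT_FIELDS, groups[3:]))

# Hypothetical audit line; real entries in the log file may look different.
sample = (
    "2007-04-25,10:15:30,INFO, ERS2.Audit - [byte-size=2048, "
    "job-ended=10:15:30, job-owner=admin, job-received=10:15:28, "
    "job-started=10:15:29, mime-type=application/pdf, name=SalesReport, "
    "page-count=3, record-count=120, status-code=200]"
)

leading, audit = parse_audit(sample)
print(leading)
print(audit["mime-type"], audit["page-count"])
```

Lines that are not audit entries return None here, which is a convenient way to check which lines of the log the expression would pick up.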

To download the necessary files for this recipe, refer to the attached ZIP file.
RegEx.zip (77.4 KB)