Kevin Hood
Engineering Manager, SquaredUp
In this article, I’m going to discuss custom logs in Log Analytics. Along with KQL Join and the Let Statement that I discuss in another blog, custom logs is a concept that I struggled to wrap my head around for a long time, as there don’t seem to be very many comprehensive guides out there as of yet. Here is a summary of everything I have managed to piece together from documentation and other people’s blog posts. Hopefully it will help you along your Kusto journey (I know it would’ve helped me)!
If you are new to Kusto, check out our Jumpstart guide to KQL before coming back to this one. You may also find my previous article on Aggregating and Visualizing data with Kusto helpful.
So, the power of Kusto can be brought to bear on your own log data! Fantastic! However, getting the initial setup done for this is a little bit painful in my opinion especially if the log file format isn’t particularly friendly in how it’s formatted. However, once the setup is done it’s largely plain sailing and you can query and visualize data that was previously just a lot of lines of text.
To use Custom Logs you’ll first need a Log Analytics workspace and a computer connected to it. Open the workspace and go to Agent management. Download the appropriate agent from this page and install it on the machine that hosts the logs you’re interested in.
With the agent installed, open its settings from Control Panel -> Microsoft Monitoring Agent. Click the Azure Log Analytics (OMS) tab and enter the Workspace ID and either the Primary or Secondary key shown on the Agents management page of the workspace. You should see a green tick in the Status column after 30 seconds or so. Depending on your environment, you may also need to configure your proxy settings on the Proxy Settings tab.
With the agent set up you can now tell your workspace about your custom log.
In the Azure Portal, click Custom logs in your Log Analytics workspace.
Click Add custom log in the right-hand pane to start the process of adding a new custom log.
You’ll need to give the workspace a bit of a taster of the logs you want to gather, so copy a few lines from an existing log into a text file and use this as the sample file in the wizard that appears after clicking the Add button.
Once this has been done, you’ll need to specify the log collection path(s) for your custom log. This path will apply to all servers that you join to this workspace, and also needs to include the logfile name and file extension. Wildcards are supported in the log collection path.
With the collection path added, give the table that the data will be imported into a name. Custom logs always have _CL added to the end of the table name. Press Done and that’s it; it is all configured.
The configuration you have just created will be sent to the Monitoring Agents on the servers that are connected to the workspace. This is where a little patience is required: it can take an hour or so for connected servers to start sending logs to Log Analytics. If it still hasn’t happened after an hour, you may need to check the log file collection path you’ve configured, the connection your server has to Azure (does it need a proxy to get to the Internet?) and whether new log files have actually been created in the path you specified. It’s also worth checking the Operations Manager event log on the servers that are meant to be sending logs to the workspace; On one occasion I spent a lot of time trying to get some custom logs into my workspace and then eventually found an event that indicated that the Monitoring Agent didn’t like the fact that only TLS 1.2 was enabled on my server!
If all goes well, you should see your table getting populated with records from the custom log you defined. The log data is added into a column called RawData, and from this column you can create your own custom columns.
To do this, expand one of the rows and select the ellipsis. Click “Extract fields from [table name]” to launch the custom fields page.
Use your mouse to highlight the data in the RawData field that you want to split into a new field. In this case I have highlighted the number 50 from the original “Server1,50” in the RawData field. Give your field a name and a datatype. As with the table name, custom fields always have _CF tagged on to the end of their names.
Once you have done this, a sample of matching rows will display on the right-hand side of the page along with some statistics about how many rows matched. If you spot any missing or erroneous matches in the sample, you can click the edit button and select the appropriate option to either modify the match or ignore an incorrect match.
If you select “Modify this highlight” the selected line will appear on the left-hand side of the page. Highlight the correct text to update it. The changes you make to the sample rows will then adjust the matching algorithm for the custom field (hopefully making it more accurate!).
If I’m honest, my experiences with the custom fields functionality have been a bit hit and miss. If your log files are very uniformly delimited then I think it works quite well. However, if the format of the log file is a bit more complex, then it can be a challenge getting custom fields to be 100% accurate. In this case, it might be easier to just split the RawData field as part of your query.
As an example, let’s take a look at the line from a SquaredUp log file that shows a user logging on:
2020-08-10 13:16:31.342 >> 010 >> [INF] DecomposedUsername { Raw: "testdomain\kevin.hood", UsernameOnly: "kevin.hood", Domain: "testdomain", SourceFormat: DownLevelLogon } lease result NewLease yielding Lease { Expiry: 08/10/2020 12:36:31, SessionId: "njmhkfafhxi10uxitrlkt5ev", DebuggingId: dba4585c-21cf-4dee-8836-b3b1e628d544 }
We’ve got tabs, angled brackets, square brackets, curly braces, colons and commas in there which is going to make it challenging for the custom fields functionality in Log Analytics. It’s easier to decide what information we’re after in this instance and then come up with a query that gets its data directly from the RawData field.
Unfortunately, the other log lines of interest are similarly complex but, with a little bit of trial and error and a lot of split() and substring() practice, we can match up log lines with Session ID assigned to a user and get a general idea of what a given user is up to in SquaredUp:
//User-defined function to get unique usernames and browser session IDs from the appropriate log line
let username = SquaredUp2_CL
| extend User = iff(RawData has "DecomposedUsername { Raw: ", split(RawData, "\"")[3], "")
| extend SessionId = case(RawData has "DecomposedUsername { Raw: ", split(RawData, "\"")[7],
RawData has "Parameters:", split(RawData, "\"")[7],
RawData has "<--", split(RawData, "\"")[5],
"")
| where User != "" and SessionId != ""
| distinct User, SessionId;
//Join the list of unique username/session ID combinations with the rest of the log entries using the session ID.
username
| join kind=inner (
SquaredUp2_CL
| extend ThreadId = toint(substring(split(RawData, ">>", 1), 2, 4))
| extend LogLevel = tostring(substring(split(RawData, "[", 1), 2, 3))
//
//There are a few different line formats that include a session ID, so we have to identify the line format and then
//extract the part of that line the contains the session ID.
//
| extend SessionId = case(RawData has "DecomposedUsername { Raw: ", split(RawData, "\"")[7],
RawData has "Parameters:", split(RawData, "\"")[7],
RawData has "<--", split(RawData, "\"")[5],
"")
//
//Extract other useful information like session expiry time, HTTP method and the endpoint being called
//and create new columns for this data using "extend"
//
| extend SessionExpiry = iff(RawData contains "DecomposedUsername { Raw: ", substring(split(RawData, "Expiry:")[1], 1, 19), "")
| extend EndpointCallExit = iff(RawData has "<--", strcat(split(RawData, "\"")[1], "::", split(RawData, "\"")[3]), "")
| extend EndpointCallEntry = iff(RawData has "Parameters:", strcat(split(RawData, "\"")[3], "::", split(RawData, "\"")[5]), "")
| extend HttpMethod = iff(RawData has "[DBG]" and RawData has "::" and RawData !has "Session" and RawData !has "<--", split(RawData, "\"")[1], "")
)
on SessionId
| project TimeGenerated, LogLevel, ThreadId, User, SessionId, SessionExpiry, HttpMethod, EndpointCallEntry, EndpointCallExit
| order by TimeGenerated asc
Running this query on some SquaredUp log entries that have been uploaded to my workspace gives me the results below. I can see a variety of information matched to a user and a browser session ID, some of which tells me what action they were performing in SquaredUp.
We can then use this query as the basis for more specific queries that we can visualize. So, for example, we could use SquaredUp to visualize how extensively each user has used SquaredUp over the last 24 hours by counting the number of browser calls to any SquaredUp server API endpoint:
let username = SquaredUp2_CL
| extend User = iff(RawData has "DecomposedUsername { Raw: ", split(RawData, "\"")[3], "")
| extend SessionId = case(RawData has "DecomposedUsername { Raw: ", split(RawData, "\"")[7],
RawData has "Parameters:", split(RawData, "\"")[7],
RawData has "<--", split(RawData, "\"")[5],
"")
| where User != "" and SessionId != ""
| distinct User, SessionId;
//Join the list of unique username/session ID combinations with the rest of the log entries using the session ID.
username
| join (
SquaredUp2_CL
| extend SessionId = case(RawData has "DecomposedUsername { Raw: ", split(RawData, "\"")[7],
RawData has "Parameters:", split(RawData, "\"")[7],
RawData has "<--", split(RawData, "\"")[5],
"")
| extend EndpointCallEntry = iff(RawData has "Parameters:", strcat(split(RawData, "\"")[3], "::", split(RawData, "\"")[5]), "")
)
on SessionId
| summarize count() by User
This is a bit of a blunt instrument to measure SquaredUp activity as different actions are going to trigger a different number of API calls but hopefully it demonstrates that you can derive and visualize useful data, even from complex log files and without creating custom log tables and custom fields.
If you need to visualize Logs data, perhaps alongside other data from REST APIs, Microsoft SQL data, PowerShell, ElasticSearch and other data sources, then you can do this with our Dashboard Server product.
Dashboard Server is platform-independent and free of charge!
There are a few steps needed to provide it with access to your KQL-queryable data in Azure; you need to create an Azure AD Application to give Dashboard Server API access to, for example, Log Analytics data and then create a new Log Analytics Provider in Dashboard Server to access Log Analytics workspace data via this application. Once the provider is in place, then all that is needed is for you to write some KQL to return the data you’re interested in, and Dashboard Server will do the rest!
That’s all I’ve got for this article, but with any luck it has been helpful to you. Be sure to check out the rest of our Kusto and Azure Monitor content that I’ve linked below. Thanks for making it all the way to the end!
Resources:
For more advanced things you can do in Kusto:
To see the difference SquaredUp can make to your visualizations in Azure:
To learn all you need to know about Azure Monitor:
Share this article to LinkedIn to spread the knowledge: