My security team wants the information gathered by session recording to be parseable in a SIEM. How do I achieve this?
There are several ways to get this information into a SIEM. You can use any or all of them:
Solution 1: Enable enhanced session recording: https://goteleport.com/docs/server-access/guides/bpf-session-recording/
Enhanced session recording captures not only the screen contents of an SSH session but also what is happening under the hood, noting which commands are launched and when. This additional information is sent to the Teleport audit event log, so make sure that your SIEM is configured to receive those events. Each event includes the session ID, so events can be correlated with the session they belong to.
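As a rough illustration of that correlation, the sketch below groups audit events by session ID. It assumes the events reach you as one JSON object per line and that the session ID lives in a field named `sid`. The field names and the sample events are assumptions for illustration, so check them against what your Teleport version actually emits.

```python
import json
from collections import defaultdict


def group_events_by_session(lines, sid_field="sid"):
    """Group JSON-lines audit events by session ID.

    `sid_field` is an assumed field name; adjust it to match your events.
    """
    sessions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict):
            continue
        sid = event.get(sid_field)
        if sid:  # events without a session ID are left out
            sessions[sid].append(event)
    return dict(sessions)


# Made-up sample events for illustration only; real events carry more fields.
SAMPLE_EVENTS = [
    '{"event": "session.start", "sid": "eb593b1d-15f0-4c1e-b91e-b2f8f7738886"}',
    '{"event": "session.command", "sid": "eb593b1d-15f0-4c1e-b91e-b2f8f7738886", "program": "ls"}',
    '{"event": "user.login"}',
]

for sid, events in group_events_by_session(SAMPLE_EVENTS).items():
    print(sid, [e["event"] for e in events])
```

Most SIEMs can do this grouping natively once the events are ingested; the sketch only shows the idea of using the session ID as the join key.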
Solution 2: Ingest the session recording files in JSON format
Each session recording is stored by the auth service. Recordings usually go to a blob storage system, but they can be stored on a regular filesystem too. Although the files have a .tar extension, they are not actually tar files. They are stored in a protobuf format that holds basic metadata about the session, what text appears on the screen and when, and little else. Even with enhanced session recording on, these recordings contain only the playback info and basic metadata.
To dump this metadata to a more machine-friendly format, use the `tsh play` command. You don't need an active tsh session to dump the JSON metadata from a .tar file that you have locally.
tsh play -f json eb593b1d-15f0-4c1e-b91e-b2f8f7738886.tar > eb593b1d-15f0-4c1e-b91e-b2f8f7738886.json
You can configure your SIEM to do the translation for you, or you can have a script watch for new session recordings and do the conversion, writing the JSON to a place that the SIEM can read.
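A conversion script along those lines could look like the sketch below. It is a minimal example under stated assumptions: the recordings sit in a local directory as `<session-id>.tar` files, `tsh` is on the PATH, and the function and directory names are hypothetical. It converts each recording that does not yet have a matching `.json` file, using the same `tsh play -f json` command shown above.

```python
import subprocess
from pathlib import Path


def run_tsh_play(recording: Path) -> bytes:
    """Run `tsh play -f json <recording>` and return what it prints."""
    result = subprocess.run(
        ["tsh", "play", "-f", "json", str(recording)],
        check=True,
        capture_output=True,
    )
    return result.stdout


def convert_new_recordings(recordings_dir, output_dir, convert=run_tsh_play):
    """Convert every *.tar recording that has no .json counterpart yet.

    Returns the list of JSON files written during this pass.
    """
    recordings_dir = Path(recordings_dir)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for recording in sorted(recordings_dir.glob("*.tar")):
        target = output_dir / (recording.stem + ".json")
        if target.exists():
            continue  # already converted on an earlier pass
        # Write to a temporary name first so the SIEM never reads a partial file.
        tmp = output_dir / (recording.stem + ".json.tmp")
        tmp.write_bytes(convert(recording))
        tmp.replace(target)
        written.append(target)
    return written
```

Calling `convert_new_recordings("/path/to/recordings", "/path/to/siem-input")` from cron or a small polling loop would pick up new files as they appear. Both paths are placeholders.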
Solution 3: Dump the raw session playback and parse the terminal output with your SIEM
This is a tricky solution because so many programs and other unexpected things can show up in an SSH session. Parsing is likely to be moderately error-prone, but it may still be helpful to collect the information contained in the recordings and try to parse it.
tsh play -f pty eb593b1d-15f0-4c1e-b91e-b2f8f7738886.tar > eb593b1d-15f0-4c1e-b91e-b2f8f7738886.raw
This .raw file has no real timing information in it (other than the timestamp that Teleport puts at the top right corner). It is simply a bytestream of every character sent to the interactive session terminal during the session, including control characters for erasing text, setting colors, and so on. You can remove certain control characters with the `col` program. For example, to remove "backspace" type control characters:
col -b < eb593b1d-15f0-4c1e-b91e-b2f8f7738886.raw
The result is a semi-readable rendering of the bytestream, because nothing that appeared on the screen is ever erased. Full-screen TUI programs such as vim or curses applications come out very muddled, but it is still possible to match specific strings or keywords in the output, which may be enough for some use cases.
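If you would rather do the keyword matching in a script than in the SIEM, the sketch below shows one possible approach. It is a simplification and not a terminal emulator: it strips common ANSI escape sequences with a regular expression, treats a backspace as deleting the previous character (which is not exactly what `col -b` does), and then searches each line case-insensitively. The function names and the sample bytes are made up for illustration.

```python
import re

# CSI sequences (colors, cursor movement, erase) and BEL-terminated OSC
# sequences (e.g. window titles). Other escape types are not handled.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]|\x1b\][^\x07]*\x07")


def apply_backspaces(text):
    """Treat each backspace as deleting the previous character on the line."""
    out = []
    for ch in text:
        if ch == "\b":
            if out and out[-1] != "\n":
                out.pop()
        else:
            out.append(ch)
    return "".join(out)


def find_keywords(raw, keywords):
    """Return (line_number, keyword, line) for every keyword hit in `raw`."""
    text = raw.decode("utf-8", errors="replace")
    cleaned = apply_backspaces(ANSI_ESCAPE.sub("", text))
    hits = []
    for lineno, line in enumerate(cleaned.splitlines(), start=1):
        lowered = line.lower()
        for keyword in keywords:
            if keyword.lower() in lowered:
                hits.append((lineno, keyword, line))
    return hits


# Made-up sample: colored text with a typo corrected via backspace.
sample = b"$ \x1b[1;32msudz\x08o\x1b[0m su -\r\n$ ls\r\n"
for hit in find_keywords(sample, ["sudo"]):
    print(hit)
```

Output from full-screen programs will still be noisy after this cleanup, so treat any hits as hints for a reviewer rather than as reliable evidence of what was run.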