The Importance of Grammar in Forensic Linguistics


Commas matter and grammar matters, especially when you deal with threat letters, poison pen letters or even ransom notes. In these cases, grammatical errors, misspellings or unique writing styles might reveal the person behind the mischievous texts. Are you dealing with one author or multiple individuals? Can you link these letters to other reference documents, e.g. internal employee emails? The art of analyzing written documents in investigations is a subset of forensic linguistics.

While I won’t go through any real examples in the following article, I would like to share my experience when dealing with such cases. First up, I won’t even try to get into graphology. This is the analysis of handwriting, in an attempt to evaluate personal characteristics or the psychological state of the writer.

Graphologist: “The author is a male and he is very angry, possibly holding a grudge against the recipient.”

Intel analyst: “No shit, sherlock. The handwriting is sloppy and why else would he write a poison pen letter?”

The most important tool for me is a set of highlighters. When dealing with multiple documents, I find it easier to print them out, mark peculiarities with the highlighters and add handwritten notes of my own. I use different colors for different categories: one for spelling mistakes, one for grammatical errors, one for uncommon words or unique word creations and a final color for certain style elements.

Let’s start off with the first category: spelling mistakes. Many people have distinct spelling mistakes they constantly make, and they will not always recognize these mistakes when proofreading their own work. Sometimes these mistakes might also indicate whether the author is a native speaker or not. A German writing an English text might automatically use Telefon instead of telephone. Furthermore, some languages, German among them, capitalize nouns, so look out for this as well. Other spelling mistakes may derive from autocorrect functions in Office. When I open Word, it assumes I’ll write in German and autocorrects based on the German dictionary. Newer versions of Word notice I’m typing English after about one sentence and then automatically adjust; older versions might need a manual reset. When typing or writing quickly, one may produce clerical errors, such as forgetting letters, adding letters or switching letters. In this case, always check how the letters are arranged on the keyboard to understand the origin of these typos. Keep in mind that different countries use different keyboard layouts!
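The keyboard check can be illustrated with a few lines of Python. This is only a minimal sketch, not a forensic tool: the layout table covers just the letter keys of QWERTY and QWERTZ, and the function names are made up for this example. It tests whether a typed letter sits next to the intended one on a given layout.

```python
# Letter rows of two common layouts (letters only; umlauts and symbols omitted).
LAYOUTS = {
    "qwerty": ["qwertyuiop", "asdfghjkl", "zxcvbnm"],
    "qwertz": ["qwertzuiop", "asdfghjkl", "yxcvbnm"],
}


def neighbours(letter, layout):
    """Return the keys left/right of a letter and roughly above/below it."""
    rows = LAYOUTS[layout]
    result = set()
    for r, row in enumerate(rows):
        c = row.find(letter)
        if c == -1:
            continue
        # Same row: keys directly to the left and right.
        for cc in (c - 1, c + 1):
            if 0 <= cc < len(row):
                result.add(row[cc])
        # Adjacent rows are staggered, so check the same and neighbouring columns.
        for rr in (r - 1, r + 1):
            if 0 <= rr < len(rows):
                for cc in (c - 1, c, c + 1):
                    if 0 <= cc < len(rows[rr]):
                        result.add(rows[rr][cc])
    return result


def is_adjacent_slip(intended, typed, layout):
    """True if `typed` is a plausible slip of the finger for `intended`."""
    return typed in neighbours(intended, layout)
```

A "z" typed in place of a "t" counts as a neighbouring-key slip on a German QWERTZ keyboard, but not on a US QWERTY keyboard, where "z" sits in the bottom row. A cluster of such slips can hint at the layout the author was typing on.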

The next one is a bit more tricky. I have to admit, my grammar isn’t the best. I usually just know that something looks weird, without being able to grasp the actual reason or grammar rule. So, in this phase of investigations I often google certain grammar rules to make sure my hunch was right. From simple things such as mixing up your and you’re, to the improper use of commas, there are many different errors that might show up in multiple documents. One important thing to remember is that it will be the sum of indicators that leads to successfully solving the case. It most likely won’t just be one blatant error.

Depending on their background, individuals may spell words differently. Spelling may vary between British and American English, and a text may contain colloquial terms, slang or dialect. Everything that differs from the standard form of writing in the specific area you are working in should be marked with a highlighter. Using modern-day slang might indicate a younger person; old-fashioned terms will probably not be used by a kid. I once had a case in which the author creatively invented new curse words I had never heard of before. Some of them were so hilarious, I actually added them to my personal vocabulary. Another example would be the use of local dialect. In Germany bread rolls are named differently in many regions: Brötchen, Wecken, Semmel, Schrippe, Krossen, Normale, Rundstücke; these are all the same thing! I’m sure similar examples can be found in other languages and for more relevant terms as well. Try to figure out which region the word originates from. Again, a little googling can be helpful here.

Next up, concentrate on the style of writing. Is there anything that sticks out? Look for specific punctuation, such as the frequent use of exclamation marks or multiple dots…. Also, concentrate on the sentence structure. Is the author using short sentences or is he fond of long-winded sentences? Does the whole document read as if it were written by the same person? A shift in style may indicate that some part was copied from another document. Finally, have a look at the format: font, size, line spacing, alignment. After marking all documents according to the above points, it’s time to spread them out and get a bird’s-eye view of all of them. Sometimes, this will reveal more similarities or conspicuous features shared by multiple documents.
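Some of these style indicators can also be counted automatically when the documents are available as text. The following is a rough sketch under simple assumptions: sentences are split on end punctuation only, and the function name and the chosen indicators are just examples. It complements the highlighter work and does not replace it.

```python
import re


def style_profile(text):
    """Compute a few crude style indicators for one document."""
    # Split on sentence-ending punctuation; good enough for a rough comparison.
    sentences = [s for s in re.split(r"[.!?…]+", text) if s.strip()]
    # Words are runs of letters, optionally with one inner apostrophe (e.g. "won't").
    words = re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)?", text)
    return {
        "sentences": len(sentences),
        "avg_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
        "exclamation_marks": text.count("!"),
        "ellipses": len(re.findall(r"\.{3,}|…", text)),
    }
```

Running this over every letter and every reference document yields a small table of numbers. Large differences in average sentence length or punctuation habits within a single document, or close matches between two documents, are worth a second look.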

Now, for the most important aspect: Assume your adversary, the author, is well aware of his distinct mistakes and style of writing! He might try to deceive us. Changes to the usual format, such as odd fonts or a different alignment, are easy to recognize, but sometimes an author will substitute some of his unique identifiers with others, mostly by doing the exact opposite of what his style of writing is usually known for. Someone who uses long and complex sentences might break these down into short and concise sentences, making the letter look more like an old telegram. Obvious spelling mistakes might be introduced as well, to put us on the wrong track. However, anything that is deliberately done will likely follow a certain pattern. It is our job to identify this pattern.

Of course, there is much more that can be done when handling cases like these. Analyzing handwriting by overlaying different sets of handwritten words on each other is one technique that might help. This works really well in MS Office, since the office suite has some pretty impressive features to handle images. Furthermore, fingerprint identification (dactyloscopy), analyzing the paper, trying to trace back emails; a broad variety of methods can be applied here. Maybe even the graphologist, if you’re that desperate. As with all intelligence analysis, it is important to never fully rely on just one method. Combine what you have at hand to achieve the best result.

After this brief introduction to the topic of forensic linguistics, I will prepare an example for a future article, highlighting the aforementioned. I just have to figure out who I want to blackmail or send a poison pen letter to. Maybe one of the scammers from a previous project.

Matthias Wilson / 01.10.2019

OSINT Key Findings in the Year 2009

Syria, nonproliferation sanctions, OSINT, Google Dorks and SIGINT. In 2009, these all came together in an interesting investigation.

Earlier this year, I wrote an article about my opinion on the future of OSINT and while doing so, I had to think about how OSINT looked in the past and how it has evolved over the years. Gathering and analyzing information, not only through OSINT, has always been my passion and I’ve been doing this for about 20 years now. Just like the recent project with Sector035, where we unraveled a massive scam network, I have often conducted research on specific topics purely out of curiosity. These side projects were never work related, but the skills I learned eventually proved useful throughout my career. Often, reading a simple news article would send me down a rabbit hole. From looking up related news articles to spending hours on Wikipedia to creating link charts, large-scale investigations were always only a mouse-click away.

I just recently recalled a project I worked on in early 2009. It all started with me looking into various nonproliferation sanctions lists. I think it was a news article that sparked my interest. These sanctions were and are imposed on countries that have been accused of trying to procure and/or produce weapons of mass destruction, e.g. nuclear, chemical or biological weapons. I started looking into government and non-government entities from Syria on those lists. Remember, this was back in 2009. There weren’t really many sophisticated OSINT tools back then, so most findings resulted from simple Google queries.

One of the entities I looked at was the Mechanical Construction Factory. Googling this led to millions of results, so I narrowed it down by adding quotation marks: “Mechanical Construction Factory”. My next step was looking for this search term in specific filetypes. PDF or PowerPoint documents tend to contain more relevant information than your average webpage. Adding the filetype operator in Google led to some rather interesting results.
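The way such queries are put together can be sketched in a few lines. This is a generic illustration and not a reconstruction of the queries used in 2009: the helper name and the placeholder domain example.org are assumptions, and only the exact-phrase quotes and the site and filetype operators are taken from the text.

```python
def build_dork(phrase=None, site=None, filetype=None, extra_terms=()):
    """Assemble a Google query string from common search operators."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')           # exact-phrase match
    if site:
        parts.append(f"site:{site}")          # restrict results to one site
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file type
    parts.extend(extra_terms)                 # any additional plain keywords
    return " ".join(parts)


# Exact phrase, PDFs only.
print(build_dork(phrase="Mechanical Construction Factory", filetype="pdf"))

# Hypothetical site restriction plus a keyword (example.org is a placeholder).
print(build_dork(site="example.org", filetype="pdf", extra_terms=["tender"]))
```

Building queries this way makes it easy to loop over a list of entity names or file types and keep a record of exactly which queries were run.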

For example, the Greek Exporters Association (SEVE) posted monthly spreadsheets of tenders originating from Syria. These lists contained information on who requested the offer (including addresses, phone numbers and email-addresses), as well as goods they were seeking to acquire.


In order to find all tender spreadsheets on this page, I again used Google dorks. Combining the site-operator with the filetype-operator brought up all the PDFs saved in the 2008 directory. Since I only wanted to look at the PDFs for Syria, I used Google Translate to obtain the Greek spelling of Syria, as each spreadsheet had this somewhere in the document. The final query looked like this:


I now had a long list of Syrian companies that had requested to purchase goods from Greece. Not only that, multiple companies used the same phone numbers, so I could assume that they were linked to each other in some way. I recall finding one or two companies that were linked to a sanctioned company by a phone number and that weren’t listed themselves.
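Linking companies through shared phone numbers is easy to automate once the spreadsheet rows are in a list. The sketch below assumes the data has already been extracted into (company, phone) pairs; the company names and numbers are fictional, and the normalisation rule (digits only, leading "00" dropped) is a simplification.

```python
from collections import defaultdict


def normalise_phone(number):
    """Reduce a phone number to digits so differently formatted copies match."""
    digits = "".join(ch for ch in number if ch.isdigit())
    # Treat a leading "00" like "+", both mark an international prefix.
    return digits[2:] if digits.startswith("00") else digits


def shared_numbers(records):
    """Map each phone number to the companies using it, keeping only shared ones."""
    by_number = defaultdict(set)
    for company, phone in records:
        by_number[normalise_phone(phone)].add(company)
    return {num: sorted(names) for num, names in by_number.items() if len(names) > 1}


# Fictional sample rows for illustration only.
records = [
    ("Company A", "+963 11 555-0100"),
    ("Company B", "00963 11 5550100"),
    ("Company C", "+963 11 555-0199"),
]
print(shared_numbers(records))
```

Any number that appears under more than one company name is a candidate link. It still needs manual verification, since shared switchboards or agents can produce the same pattern.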

Playing around with Google dorks had me find plenty of interesting material to go through. While I can still reproduce the example mentioned above (just try it yourself), the most interesting finding in this case is unfortunately lost.

Back then, Turkey had a government organization named “Undersecretariat for Defence Industries”. The Turkish abbreviation of this was SSM. The SSM-website doesn’t exist anymore, as the organization was renamed and restructured in 2018 (as SSB). This organization posted roughly 150 scanned original tenders from Syria on their website. While not directly accessible through a dedicated page, using the Google dorks had them appear in my queries. These documents contained phone numbers, addresses, signatures and seals that were stamped on the paper. Apparently, they were sent to Turkey in hardcopy or scanned and then sent electronically.

Keep in mind, I did all this at home. This was my hobby and not related to my actual line of work. I was a SIGINTer, not an OSINTer at work, tasked with a completely different area of operations. However, these original documents seemed like something my colleagues working on Syria would also be interested in. I took an example of one of the tender documents to work one day and showed it to the guys at the Syria desk. They could not believe that I had just found this online. Some of them were even convinced that I had access to their data and had pulled it from there. I ended up directing them to all the documents I had discovered on the aforementioned Turkish site, and these proved to complement the knowledge the Syria desk already had.

While writing this article, I tried to find those documents using the Wayback Machine, but as I previously mentioned, they weren’t located on a page that could be easily accessed. So, they unfortunately weren’t archived. I went through the complete site map in the Wayback Machine with no luck. For those of you who don’t know this function, try it out. It is great for getting an overview of the structure of a historic website.
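Besides the site map in the web interface, the Wayback Machine also offers a CDX query interface that lists archived URLs for a domain. The sketch below only builds such a query URL and does not send it. The endpoint and parameter names reflect the public CDX server documentation as I understand it and should be verified before use; example.org is a placeholder domain and the function name is made up.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Wayback Machine CDX server.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain, mimetype=None, limit=1000):
    """Build a CDX query URL listing archived captures for a domain."""
    params = {
        "url": domain,
        "matchType": "domain",                # include subdomains and all paths
        "output": "json",
        "fl": "timestamp,original,mimetype",  # fields to return
        "collapse": "urlkey",                 # one row per unique URL
        "limit": str(limit),
    }
    if mimetype:
        # Optional filter, e.g. only captures served as PDF.
        params["filter"] = f"mimetype:{mimetype}"
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


print(cdx_query_url("example.org", mimetype="application/pdf"))
```

Opening the resulting URL in a browser returns a list of captured URLs, which can be searched for file names or directories that were never linked from a regular page. Files that were never crawled will not show up here either.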


In 2009, many people underestimated the power of OSINT. In 2019, I don’t think many people will make that mistake again. No fancy tools were needed back then, just some Google dorks and perseverance to manually go through hundreds of PDFs. Although things have changed in the OSINT world and continue to change as we move along, I am sure there is still plenty of juicy information that can be found on the internet by just mastering the use of Google operators. Happy hunting, fellow OSINTers!

Matthias Wilson / 27.09.2019

Social media is dead, long live social media!

Is your intelligence target under 25 and not on Facebook? You might want to check the social media that kids nowadays are actually using!

My daughter always says: “Dad, Facebook is for old people!” It’s true, I’ve noticed that many people under the age of 25 aren’t on ‘traditional’ social media anymore. They are not on Facebook and they may give you a confused look if you confront them with MySpace, GooglePlus or walkmans.

So, how and where do you find Generation Z on social media? Clearly, they still feel the urge to express themselves on the internet and they’re still out there, but mostly not under their real names. This makes OSINT much more challenging. On Facebook we could search for real names, we could search by phone number and in some cases we could find people through email addresses. Some of these techniques work on other social media platforms, some don’t. In any case, if you find a profile linked to one of your targets, you might come across further social media profiles that your intelligence target has linked from the one you have found.

I’ve noticed that many young people use TikTok, an app designed to share short music videos. It offers likes, friends and comments, similar to what we know from ‘traditional’ social media. Luckily, the TikTok app allows you to find profiles linked to phone numbers. For this, you need to install the app either on your burner phone or in an Android VM, then go to the profile page and tap the ‘add contact’ button on the top left. The red dot indicates that new contacts have been found.


Next up, choose the option in the middle, stating that you would like to find contacts from your phone book. This of course means you have to add the phone numbers of your intelligence targets to the phone book first and give TikTok access to it.


Tapping ‘find contacts’ will show the number of phone numbers that are linked to TikTok accounts and it also gives you the choice to follow them. It looks like some of my contacts are actually using TikTok.


If you have a nickname, even one derived from other platforms, it can be looked up in the app itself too. TikTok will only allow you to search for the beginning of the nickname and not for parts in the middle or at the end of the name. In the following screenshot I looked for nicknames containing ‘James’ and I was only shown names starting with ‘James’. The reason this is relevant is that I have often found TikTok accounts that add prefixes or suffixes to their regular nicknames. So instead of just ‘James’, you might find the user as ‘xyz.james’ or ‘james.1982’.

However, there is a workaround for this. Just like with Instagram, there are many sites that scrape TikTok and display the accounts and in many cases the content as well. One of the ones I like to use is PlayTik. PlayTik allows you to search for hashtags and accounts. Let’s find an account that somehow uses ‘f1nd1ng’ in the nickname.


There we go, two accounts containing the search term. Now you can have a look at the profile and check out any videos this profile has uploaded (and publicly disclosed). It looks like this particular profile also links to further social media and websites, like I had mentioned before. Plus, the profile contains a video. Feel free to watch it!


Facebook may be fading (soon), but other platforms will replace it. Thus: Social media is dead, long live social media! The new platforms are not just for young people, so go and try them out (research them) yourselves!

Matthias Wilson / 13.09.2019


Unravelling the Norton Scam – Final Chapter

Gotcha! We found out who is responsible for this massive scam. Using OSINT and social engineering we tracked down the company behind the Norton Scam.

Chapter 1 – It all starts with a bad sock puppet

Chapter 2 – The Art of OSINT

Chapter 3 – What’s the big deal? And who’s to blame?

Chapter 4 – The more, the better

Chapter 5 – Mistakes on social media

Chapter 6 – Tracing ownership

Final Chapter – Putting the pieces together

Time to finally unravel the Norton scam. Sector and I have decided to conclude our investigations and put the pieces together, after spending countless hours working on this case. Every time we thought we had figured it out, new information was found, taking us down another rabbit hole. Sometimes we spent days following a lead, just to find out that it wasn’t related to our case at all. As with most investigations, we were not able to solve all mysteries, but we are pretty sure we identified the company and some individuals behind this massive scam scheme.

In the last chapter, we pointed out how everything led to a specific Indian phone number (+91.9540878969). This number was used to register many of the domains we were looking into. Once more, I decided to make some phone calls to India. I found out that the number belongs to a web design office. The first four phone calls were answered by different men who did not understand English, so they hung up on me. My fifth phone call was more successful. I got a hold of a woman named Priya and told her that a friend of mine had recommended them and that I was looking to have a website set up for me. I had called the right place and I would need to speak to her boss, Priya explained. I also mentioned that the site was to be used as a scam site to obtain credit card data. This too was possible according to Priya. Soon afterwards I had a conversation with the boss, who remained nameless. If I was willing to pay roughly $150 on PayPal, they would set up the site I needed. With these phone calls, we have proven that the web design office was responsible for setting up the type of scam sites that we have seen throughout our investigations.


During our research, we also came across a site which offered web design services to US customers and for which we actually found legitimate websites they had created. This is something very common: using a US front end to sell IT services that are performed in India. So, not everything the team did was illegal or scam-related.


In order to promote the scam sites, another team was responsible for search engine optimization (SEO). The SEO team was most likely also located in the offices of the web design team, probably under the same leadership. Their job is to flood the internet with backlinks in order to promote the scam sites. So far, we have found more than 20,000 entries for this cause. From Facebook posts, to Medium blogs, to comments on non-related webpages; a large variety of backlinks were created in the past year.


As mentioned in chapter 3, the purpose of the scam is to have the victims call one of the tech support phone numbers. Thus, a team of call center agents is required. Remember how the scam works? If an unsuspecting victim calls the number, the agents provide ‘assistance’ by obtaining remote access to the victim’s computer. In some cases malware is installed, in other cases they ask for credit card data in order to bill the customers for their service.


These call center agents were hired by a company named 4compserv, which is located at an address that was also used to register some of the identified scam domains. We suspect this is the root of all evil, the company behind the scheme. Or at least some employees of the company, since we have also found evidence that 4compserv conducts legal business as well.


More evidence came up, which proves that the web design office and the call center are definitely related. Shortly after I had spoken to the boss of the web design office, I received a phone call from the number linked to the call center (+91.97117613). Unfortunately, I missed the call and haven’t been able to reach them ever since. Furthermore, one of the scammers I had personally texted with recently updated the CV on his website. Have a look at his current jobs:


While there are still some questions to be answered, our research has enabled us to have an overall understanding of the network and the techniques used to run their scam, as well as identifying the company most likely behind this scam: 4compserv in Noida, India.


Along the way, we would often stumble upon funny facts. Some of the scam developers were just so sanguine, they didn’t want to obscure their tracks. Such as the preferred use of the name ‘Nancy Wilson’ to register domains or create sock puppets. The original websites the scammers had set up were very crude, now it seems they are using nice looking WordPress templates, including chatbots. Usually, the chatbot would ask for a phone number, so the scammers can call back. And guess who you would be chatting with on all of these sites? Good ol’ Nancy!


We’re done! We managed to find the perpetrators behind all this. What started with a sock puppet on Medium led to unravelling a large-scale scam network, targeting unsuspecting victims seeking tech support. We hope that our project may help counter the threat originating from this specific scam and raise awareness for similar schemes. Also, thanks to many of our readers for sharing the posts from this series on Twitter and LinkedIn, ultimately ranking the articles higher and higher on Google. Using OSINT and social engineering to enable counter-SEO against the scammers’ massive SEO effort!

Now it’s time to relax a bit…before we start the next awesome project!

Sector035/Matthias Wilson – 25.08.2019

We explicitly decided to keep the disclosure of personal information on the investigated individuals to a minimum in these blog posts. However, the complete information gathered is available to law enforcement and/or the companies targeted by this scam upon request.

Unravelling the Norton Scam – Chapter 1

If you have problems with Norton 360 or Norton Antivirus, please do not call +1-844-947-4746. You might end up with malware on your computer.

This is the start of a series of blog posts revolving around a massive scam network that targets individuals looking for tech support regarding various software products. The scam mostly starts with fake Norton 360 and Norton Antivirus sites, but it has also been linked to fake Microsoft support sites and fake Facebook support sites (just to mention a few). We dug into this network, trying to identify the perpetrators behind it, and used lots of different OSINT techniques over the course of several months. Every once in a while a little social engineering came in handy, as we also contacted some of the suspected perpetrators directly. Our investigations are not over yet and there is still more to be found, but let us take you along on this fascinating journey of online investigations.

Chapter 1 – It all starts with a bad sock puppet

Do you have a look at the accounts that connect with you on Twitter or Medium? I do, and so does my buddy Sector035. In late April 2019, a new person followed Sector’s blog on Medium and he had a look at this new follower.


A weird URL? A nice picture of a female named Pierre? This profile was begging for further research. The URL led to a tech-support site that listed the following phone number: +1-844-947-4746. Sector didn’t even wait to check this number on his computer and immediately googled it on his cell phone. I guess that’s what you call OSINT curious.


It turns out that this phone number was listed on numerous obviously fake sites and blog posts offering tech-support. Out of curiosity, we decided to take a closer look at some of the sites, in order to see how they were connected to each other and possibly find out who was responsible for creating them. At the time we had no idea how time consuming and big this project would be! Among the sites using the phone number, we initially concentrated on these four:


Each site looked worse than the other. Horrible design, bad English and next to the aforementioned phone number, they all used the same address:


While Sector started to check the WHOIS information using DomainBigData, GoDaddy and Whoxy, I looked into Google Street View and did a little reverse image searching on the photos. It turns out that all the photos used were either stock pictures or stolen from other people’s social media profiles, and the address itself was in an inconspicuous housing area. Googling the address led us to more suspicious sites, some of them using a different phone number. Among these was one belonging to a company allegedly called Energetics Squad LLC. No records existed for such a company in the State of Illinois, nor in any other state. Keep this company in mind, as it will show up in a later blog post as well!

The WHOIS check didn’t always provide the exact name of the registrant, but we found another similarity: most of the websites had been registered around March 13-14, 2019 in India.


Using DNSLytics, Sector also checked the Google Analytics ID and found that the sites were not only linked by all of what was described above, they also shared a common tracking code (UA-code). At this point, it was time to start linking the information in Maltego.
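The idea behind linking sites through a shared tracking code can be shown with a short sketch. It is not how DNSLytics works internally; it simply assumes you have already saved the HTML source of each site and looks for classic Google Analytics IDs of the form UA-digits-digits. The domains and the ID in the sample data are made up.

```python
import re
from collections import defaultdict

# Classic Universal Analytics property IDs look like UA-12345678-1.
UA_PATTERN = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")


def shared_tracking_ids(pages):
    """pages maps domain -> HTML source; return IDs seen on more than one domain."""
    by_id = defaultdict(set)
    for domain, html in pages.items():
        for ua in UA_PATTERN.findall(html):
            by_id[ua].add(domain)
    return {ua: sorted(domains) for ua, domains in by_id.items() if len(domains) > 1}


# Fictional sample pages for illustration only.
pages = {
    "a.example": "<script>ga('create', 'UA-12345678-1', 'auto');</script>",
    "b.example": "<script>gtag('config', 'UA-87654321-1');</script>",
    "c.example": "<script>ga('create', 'UA-12345678-1', 'auto');</script>",
}
print(shared_tracking_ids(pages))
```

Domains that share a tracking ID are very likely managed by the same person or team, which makes the ID a useful entity to add to a link chart alongside phone numbers, addresses and registration dates.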


What started with a bad sock puppet, led to googling information and from there to a deep dive into domain data, Google Analytics research, as well as pulling corporate records from official state registries. The hunt was on and upon finding all this correlating data, we couldn’t just let go and decided to push forward.

Soon after, we started collecting information on an actual suspect and at a certain point engaged in an interesting conversation with this person. So, stay tuned for the next chapters of our fascinating journey!

Sector035/Matthias Wilson – 31.07.2019

Вы понимаете? OSINT in Foreign Languages

It just takes one click in OSINT to land on a website in a foreign language. Investigations don’t have to stop here, if you have the right tools.

In today’s interconnected world, OSINT investigations lead us to foreign language content quite often. This does not mean we have to stop here. Thankfully, a broad variety of tools can support us in translating the content we find.

Before getting into specific tools, I have learned that you will receive the best results if you define the input language manually. Most tools can autodetect the input language, but if you’re working with short sentences or even single words, this might not function reliably. Sometimes translating very long sentences will also produce awkward results; splitting a long sentence into components could help in this case. That said, let’s have a look at some tools I use during my investigations.

First off, I would like to point out DeepL, a German company that trains AI to understand and translate texts. When it comes to translating content in German, English, Spanish, Portuguese, Italian, Dutch, Russian and Polish, DeepL has proven to be more accurate than other tools. You can copy and paste a text or upload a document to have it translated. I let the platform have a try at an excerpt from one of the older Keyfindings’ posts in German.


The next must-have is Google Translate. This extension should be installed in any browser to easily decipher pages on the fly. Next to translating complete webpages, it will show you the original text by hovering the mouse over that passage. In some cases this can be helpful, especially when Google tries to translate names of people, places or companies as well.


What if neither DeepL nor the Google Translate extension works? Maybe you’re on a page that does not use the Latin alphabet, e.g. Chinese or Arabic, and some of the content is not ASCII-coded. This happens quite often when looking at Asian websites. Another case might be handwritten information in such languages. One of my favorite tools for this is on the Google Translate website itself. Next to the obvious copying and pasting of text, as well as uploading documents, Google allows you to use foreign-language virtual keyboards to input information.


However, this isn’t always helpful. In Arabic, letters vary in shape depending on their position in the word. This makes it hard for someone not proficient in Arabic to use the keyboard. Luckily, there is a workaround!

The Google Translate page allows you to draw what you see and based on that it will make suggestions and translate them. This works really well with any character-based writing, such as Chinese, Korean and Japanese, as well as with other languages that don’t use the Latin alphabet (Russian, Hindi, etc.). I have added a quick video to demonstrate how it works.

As an alternative, I looked into Windows Ink in Microsoft Translator, but Microsoft currently doesn’t offer an Arabic handwriting package. However, it does offer Russian, Chinese, Hindi and several other character-based alphabets and languages.

When trying to translate subtitles in videos, there is a workaround that was shared by Hugo Kamaan on Twitter, showing how you can use your cell phone camera to receive instant translations.

There are definitely more tools out there, so feel free to add anything you use frequently or that you think is missing in the comments.

Я надеюсь, что это было полезно для вашего расследования OSINT!

Matthias Wilson / 21.07.2019

How GDPR affects OSINT

The introduction of GDPR was a shock to many. While there are limitations, it doesn’t prohibit OSINT work completely. Find out what you can or cannot do when conducting investigations.


In almost all OSINT activities, we process (e.g. collect, store, analyse, reproduce) personal data including names, addresses, user names, phone numbers, IP addresses and much more. However, the new data protection legislation introduced in the European Union in May 2018, the General Data Protection Regulation (GDPR), restricts the processing of personal data. Therefore, OSINT researchers need to have a good understanding of how the GDPR applies to their situations, if only to stay on the legal side with their work.

In this blog post, we will discuss GDPR basics for OSINT researchers. We will not look at the exceptions. Processing personal data for household use or for journalistic purposes is for the most part exempted under the GDPR. Of course, the devil is in the details with respect to when these exemptions actually apply and this blog post will not go into those cases. Also, we will not look at OSINT for law enforcement use, as that has a different legal framework for dealing with personal data.

Hence, we will aim at OSINT in a commercial setting, where the researcher is dealing with matters such as background investigations, third party assessments and pre-employment screenings. Furthermore, this article will only discuss those GDPR aspects which are most relevant for OSINT work as we see it now. The GDPR is an extensive regulation with numerous aspects, which we cannot fully discuss in a single blog post.

Also please keep in mind: We are not lawyers and the implementation of the GDPR can differ between EU countries in numerous details. If you are reading this to seek legal advice, you should consult your friendly lawyer instead.

Summarising the GDPR, the aspects relevant for OSINT work are:

  1. You need a legal basis for processing personal data;
  2. You need to apply certain principles in the processing of personal data;
  3. The data subject of whom you process personal data has specific rights you need to understand, anticipate and honour;
  4. You need to understand whether you are the data controller or the data processor.

Legal basis

Data protection regulation was never meant to render the processing of personal data impossible. Instead it is meant to balance the need for data exchange in our society on the one hand, and the fundamental right of privacy for citizens on the other. It is therefore important to note that privacy is not an absolute right. The GDPR balances this right with other rights by restricting the processing of personal data to instances where there is a legal ground (Article 6). It lists six different grounds, of which three are potentially relevant for OSINT work: consent (Article 6(1)(a)), legal obligation (Article 6(1)(c)) and legitimate interest (Article 6(1)(f)). We will discuss these three in detail.

Article 7 of the GDPR sums up the conditions for consent. Consent is tricky because the GDPR states that the data subject should have free choice when giving consent. However, free choice does not really exist in, for example, an employer/employee relationship. According to the GDPR, consent can also be withdrawn at any time. So what happens when a data subject withdraws his or her consent halfway into an investigation? Or when it turns out afterwards that the consent cannot be regarded as freely given?

Due to this ambiguous legal nature of consent under the GDPR, we believe that the use of consent should be avoided whenever possible in OSINT investigations. The risk that the data subject could afterwards argue that he or she had no choice but to give consent, because of the consequences at stake, is simply too large. Moreover, it is often not practically possible to obtain consent, especially if you are, for example, examining the social media of the circles around your subject.

The second potential ground for processing personal data that may be relevant for OSINT work is legal obligation. This could be the case when your client has the obligation to identify their customers and the source of their funds under Anti-Money Laundering (AML) regulations. Especially if you are instructed by a financial institution, this is likely to be the overall legal basis for your OSINT work.

The third ground for processing personal data that may be relevant for OSINT work is a legitimate interest of your client.

What is legitimate interest? This is a tricky question, as it does not directly relate to any other law or regulation. Imagine the following scenario: A large stock-listed fashion company is in the process of hiring a new CEO. A final candidate is presented and the company puts him through a pre-employment screening. The company is known for its strong stance against animal cruelty and has supported many awareness campaigns in this regard. It has become part of its corporate identity. During the pre-employment screening, pictures are found on social media showing the CEO candidate participating in annual fox hunts. If this information were to leak once the CEO is already in position, it would surely cause a major scandal, possibly decreasing the stock value of the company and causing job layoffs.

The scenario described above could be a legitimate interest, as the financial situation of the company, and thus the prosperity of many employees, could be affected by the actions of one person. Nonetheless, pre-employment screenings are frowned upon in certain EU countries.

Another example can be a simple fraud investigation, where you are tasked with identifying possible assets belonging to a person suspected of fraudulent actions. Your client claims to have been defrauded and would like to initiate legal action. As such, the client has a legitimate interest in instructing you to identify possible assets of the alleged perpetrator.

In sum, as an OSINT investigator, you should always understand and document the legal basis for processing personal data before conducting your research. In a commercial setting, that will most often be either a legal obligation or a legitimate interest of your client. We advise always documenting the legal basis properly, for example by explicitly detailing the situation in an engagement letter or contract for the work. More generally, and from an ethical point of view, we believe that you should always want to understand the purpose of your work and the interest of the client.


Principles

Regardless of the exact legal basis, Article 5 of the GDPR imposes a number of principles for the actual processing of personal data. Each of these is relevant for OSINT work and therefore we will discuss them all.

  1. Lawfulness, fairness and transparency

Lawfulness is self-explanatory: You need a legal basis to process the personal data, and it should be one of the six legal bases provided in Article 6. Furthermore, you should not hack, steal or lie to get the data. Not only are these actions unfair and unethical, they may well be illegal in many cases. To prove that you did not resort to them, you need to document your sources, which a professional OSINT researcher would do anyway. Fairness also relates to proportionality: Is the amount of personal data you collect on the data subject proportional to the task at hand? Transparency is covered in more detail in Article 14 of the GDPR (situations where the data is not collected from the data subject), which we will discuss further along.

  2. Purpose limitation

Again, pretty easy. What you collect for one purpose shouldn’t be used for another, incompatible purpose.

  3. Data minimisation

You should minimise the amount of personal data to what you really need. As much as necessary, as little as possible. So when you scrape gigabytes of data, only the information relevant to your task should be retained.

  4. Accuracy

You have the obligation to make sure that data is of good quality, so do not use outdated information or data you know to be incorrect. Especially when working with people search engines, you will stumble upon a lot of old or false info (false flags) on individuals. You are responsible for verifying the data where possible before further processing or reporting.

  5. Storage limitation

How long do you retain your records? The GDPR states that it should not be longer than needed. In ongoing litigation that could be years, but often you should keep records no longer than general data retention obligations, such as those in civil or tax law, require. Again, this can differ per country.

  6. Integrity and confidentiality

You are responsible for keeping the personal data you process secure. The fact that you may have collected it from open sources is completely irrelevant: it shouldn’t be publicly disclosed from your side. The adage here is: If you cannot protect it, don’t collect it.

  7. Accountability

This basically means that when processing data you should not only adhere to the previous principles, you should also be able to demonstrate afterwards that you did. The GDPR has shifted the burden of proof to those processing personal data, so you need to document!

An efficient way to document how you comply with the GDPR is to draft an investigation protocol in which you describe how you process personal data and how you apply these principles to your work. In any assignment, or when you get questions from the Data Protection Authority (DPA), you can refer to the protocol.
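For researchers who script parts of their collection, some of these principles can be supported in code. The following Python snippet is only a minimal sketch of how data minimisation, source documentation and storage limitation might look in practice. All names in it (Finding, minimise, is_expired, RELEVANT_FIELDS, RETENTION), the example profile, the URL and the one-year retention period are our own assumptions for illustration. None of them is prescribed by the GDPR, and the snippet is no substitute for an investigation protocol.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Hypothetical whitelist for one assignment; every other field is discarded.
RELEVANT_FIELDS = {"name", "company", "role"}

# Hypothetical retention period; the real one depends on the case and country.
RETENTION = timedelta(days=365)


@dataclass
class Finding:
    """One piece of personal data plus the documentation needed to account for it."""
    data: dict
    source_url: str    # where the data was found (lawfulness, accuracy)
    purpose: str       # why it was collected (purpose limitation)
    legal_basis: str   # the Article 6 ground documented for the assignment
    collected_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def minimise(raw: dict, relevant_fields: set) -> dict:
    """Keep only the fields needed for the task (data minimisation)."""
    return {key: value for key, value in raw.items() if key in relevant_fields}


def is_expired(finding: Finding, retention: timedelta, now: datetime) -> bool:
    """Return True if the finding was kept longer than the retention period."""
    return now - finding.collected_at > retention


# Fictitious scraped profile, used purely as an illustration.
raw_profile = {
    "name": "Jane Example",
    "company": "Example Fashion AG",
    "role": "CEO candidate",
    "home_address": "Example Street 1",   # not needed for this task
    "relationship_status": "unknown",     # not needed for this task
}

finding = Finding(
    data=minimise(raw_profile, RELEVANT_FIELDS),
    source_url="https://example.com/profile/jane",
    purpose="pre-employment screening",
    legal_basis="legitimate interest",
    collected_at=datetime(2019, 6, 11, tzinfo=timezone.utc),
)
```

In such a setup, only the whitelisted fields would ever reach your case file, and a periodic call to is_expired could flag records that are due for deletion. Whether a whitelist and a fixed retention period are appropriate for a given assignment remains a judgment call for you and your client.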

Data subject rights

The third category of GDPR provisions relevant for OSINT concerns the data subject rights. According to the GDPR, a data subject has the following rights:

  • Notification

Every data subject should be notified which personal data is processed, by whom, why and where. Articles 13 and 14 of the GDPR state that data subjects must be informed, whether the data is collected from them directly or without their knowledge, so that they are able to contest the data collection and processing. Again, this proves to be tricky, especially when conducting investigations against a certain subject.

In some cases, the notification might be disproportionate to the effort required to inform the data subject.

  • Access, rectification, erasure and restriction

If informed, a data subject has the right of access to all data stored on him- or herself, the right to restrict further dissemination (if not necessary by law), and last but not least the right to have all data deleted. Of course, this must be weighed against the legal basis or legitimate interests upon which the investigations took place.

An exemption from these rights is also possible under Article 23 of the GDPR, which states that Member States can limit the obligations and rights under the GDPR in certain instances. This has to be done by law, and every Member State may have implemented this provision differently. The Netherlands has Article 41 of its national implementing law, which fully integrates Article 23 GDPR. Does your country have the same?

Of course, if you use an exemption, you need to document the circumstances and considerations that lead you to think the exemption is justified. Be careful with this and understand the local implementation of the (conditions for) exemptions before you apply them.

Are you the controller?

A final important point relevant for OSINT work is whether you are the data ‘controller’ (the one who determines the purposes and means of the processing of the personal data) or the data ‘processor’ (the one who processes personal data on behalf of a controller).

The easiest situation is where you are the data processor and you process data under the responsibility of the data controller. In those instances, most of the legal obligations are the responsibility of the data controller, which usually is your client.

However, the determination of whether you are the processor depends on the level of freedom you have in choosing the purpose and methods of processing the data. If you determine the purpose, types of data and methods applied, you cannot argue that you are just the processor, as you in fact ‘control’ the data processing.

Having a data processing agreement with a client – or adding a section on data processing to your existing agreement – is an important prerequisite to be regarded as the processor. To be regarded as the processor, the agreement should clearly show that you are instructed for a specific purpose, looking at specific types of data and applying specific methods (and excluding anything else) and reporting in a specific way. Once more, consult your friendly lawyer for more details.

There is limited jurisprudence on GDPR issues, which means that in many instances it is not clear exactly how the GDPR will be interpreted. The principles should give some guidance. Nonetheless, in high-profile cases make sure you discuss these matters with your client and (their) in-house legal counsel.

We have given a general overview of the most relevant aspects of the GDPR for OSINT work. However, we realise that this overview is not complete. The GDPR has a number of other relevant articles, for example on processing of special categories of personal data, but there are limits to what we can cover in one blog post.

So, get your copy of the GDPR today as well as a copy of the implementing law in your country, read it, seek advice, understand it and most important: Comply with it!

Ludo Block & Matthias Wilson (I just chipped in a little)/ 11.06.2019