KNOW IT IF YOU DON'T

This blog represents my opinions on matters around the globe.

Cyber Security - Shafqat Writes

Cyber Security: What if Data is Hacked, Poisoned, or Tampered With?

A Vice President and Global Chief Information Security Officer (CISO) once said,

“If you think you know it all about CYBER SECURITY, this discipline is probably ill-explained to you.”

Hi readers, did you enjoy reading about data and Artificial Intelligence? I hope you are now an expert in this area, especially in big data. You must also be familiar with the word “hacking”. “My data is hacked”, “NADRA data is hacked”, “FBR data is hacked” and “NBP data is hacked” are some of the common phrases you must have heard in the recent past. What does this mean? You must also be thinking about big data, because you have been reading this term in the last two or three blogs. A question must also be arising in your mind:

What if this data is hacked, poisoned or tampered with?

Remember, I promised at the end of the previous blog to tell you an amazing story about what a hacker can do with data.

Hackers (those who breach cyber security) could tamper with the data set used to train AI “chatbots” and direct them to behave the way the hackers like.
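For readers who like to see an idea in code, here is a minimal sketch of how tampering with training data can steer a model. It is only a toy under stated assumptions: the "model" is a simple nearest-centroid classifier, and the scores, labels and function names are all hypothetical, not taken from any real chatbot.

```python
from statistics import mean

def train_centroids(samples):
    """Return the average score per label (a 1-D nearest-centroid model)."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict(centroids, value):
    """Pick the label whose centroid lies closest to the given score."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Hypothetical toy training data: (suspicion score, label).
clean = [(0.1, "safe"), (0.2, "safe"), (0.3, "safe"),
         (0.7, "malicious"), (0.8, "malicious"), (0.9, "malicious")]

# The attacker slips in high-score records that are falsely labelled "safe".
poisoned = clean + [(0.95, "safe")] * 6

clean_model = train_centroids(clean)
poisoned_model = train_centroids(poisoned)

query = 0.72
print("clean model says:   ", predict(clean_model, query))
print("poisoned model says:", predict(poisoned_model, query))
```

In this toy, the same input is judged differently once the falsely labelled records are mixed in, which is the essence of "poisoning": the model itself is untouched, only the data it learns from has changed.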

AI is thus an opportunity as well as a threat. It has not only augmented existing strategies for offence and defence, but has also opened new fronts in cyber security, as “smart” people can learn new ways to exploit weaknesses in the technology. As part of their long-term game, they can start gathering and/or acquiring whatever data is available, to be used in future for purposes not known today. The ability of AI systems to make use of inoffensive and seemingly safe data is tempting people to practise “data hoovering”, which means harvesting whatever information one can gather and storing it for future strategic use, even if that use is not well defined at present.

For example, we know that AI chatbots have been trained on data sets containing text records of human conversations. If a chatbot accidentally records a conversation about some assets (strategic or otherwise), it is highly likely that its inherent ability will start assessing what kind of information constitutes a “useful asset”. It will thus transform troves of information into a tempting target for hackers and/or hostile actors (state or non-state).

Now imagine for a while what will happen if “adversarial effects” (inputs to machine learning models that attackers intentionally design to force the model to make a mistake, more commonly called adversarial examples) modify the data that is to be used in future as input for AI. The moment the scenario changes, the trained chatbot will direct the troves of big data (modified through “adversarial effects”) to prepare an appropriate response for tempting targets that would not previously have been of interest to hostile actors. Those actors can disrupt, inflict damage, and create devastation and/or confusion, to the extent that they can capture strategic assets such as intellectual property. Anyone may wonder whether this is an illusion or whether it can actually happen.

Yes, it has already happened. The astonishing story is told below.
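To make the idea of an input "designed to force the model to make a mistake" concrete, here is a small sketch. It assumes a made-up linear classifier (the weights, bias and feature values are hypothetical) and nudges each feature slightly against the sign of its weight, in the spirit of the fast gradient sign method. It is an illustration, not a description of any real attack tool.

```python
def score(weights, bias, features):
    """Linear score: a positive value means the input is flagged."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def classify(weights, bias, features):
    return "malicious" if score(weights, bias, features) > 0 else "benign"

def sign(value):
    return (value > 0) - (value < 0)

def adversarial_input(weights, features, epsilon):
    """Shift every feature by epsilon in the direction that lowers the score."""
    return [x - epsilon * sign(w) for w, x in zip(weights, features)]

# Hypothetical model parameters and input, chosen only for illustration.
weights = [2.0, -1.0, 0.5]
bias = -0.5
original = [0.6, 0.2, 0.4]

tampered = adversarial_input(weights, original, epsilon=0.25)

print("original input:", classify(weights, bias, original))
print("tampered input:", classify(weights, bias, tampered))
```

Each feature moves by only 0.25, yet the verdict flips from "malicious" to "benign". Real models are far more complex, but the principle of small, deliberate changes to the input is the same.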

In 2013, Paulo Shakarian, Jana Shakarian, and Andrew Ruef published a case study of “cyber war” through intellectual property theft: Operation Aurora. On January 12th, 2010, Google announced that it had been the victim of a cyber attack originating from China, aimed at accessing the Gmail accounts of Chinese human rights activists (this is what they suspected but could not prove). Consequently, Google announced that it would no longer censor results on its flagship search engine in China, “google.cn”, and that if this was not allowed, it would close its operations there, which triggered alarm bells in China. Immediately after this, Adobe announced that its corporate systems had also been hacked, and it was speculated that both Google and Adobe were targets of the same adversary, who also conducted the same operation against thirty-two other companies in the US. Investigation of this cyber-espionage identified a vulnerability in Microsoft Internet Explorer that was exploited by software referred to as “Trojan Hydraq” by the security firm Symantec.

Operation Aurora was initiated with “spear phishing” (a type of attack that targets specific individuals or organizations, typically through malicious emails, to inflict damage on the target’s device). It was directed at an employee using the Microsoft Messenger instant chat software, who supposedly received a link to a malicious website during one of his chats. It is unknown whether the operations against the other firms were also initiated through this chat software; it is assumed that emails may also have been used to deliver the malicious software. The initial communications to these firms had three characteristics:

  • First, they were sent to a select group of individuals, which suggests that the hackers had some additional source of intelligence on their targets,
  • Second, the communications were engineered in a way to appear from a trusted source, which also showed that the perpetrators were operating with profiles of their targets, and
  • Third, they all contained a link to a website which, when clicked, initiated a series of events: it executed malicious JavaScript code that eventually led to the theft of intellectual property.
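The second and third characteristics above suggest one simple defensive habit: check where a link really points before trusting it. Below is a minimal sketch of such a check. The allow-list and the example URLs are hypothetical, and a real organisation would rely on much more than a hostname comparison.

```python
from urllib.parse import urlparse

# Hypothetical allow-list; a real organisation would maintain its own.
TRUSTED_DOMAINS = {"google.com", "adobe.com"}

def is_trusted_link(url, trusted_domains=TRUSTED_DOMAINS):
    """Return True only if the link's host is a trusted domain or a subdomain of one."""
    host = urlparse(url).hostname
    if host is None:
        return False
    host = host.lower()
    return any(host == domain or host.endswith("." + domain)
               for domain in trusted_domains)

# A genuine-looking link and a look-alike that merely starts with a trusted name.
print(is_trusted_link("https://accounts.google.com/login"))
print(is_trusted_link("https://accounts.google.com.evil.example/login"))
```

The second link begins with a familiar name, but its real host ends in "evil.example", so the check rejects it. This is exactly the kind of disguise that makes a message "appear to come from a trusted source".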

Citing an unnamed source with direct knowledge of the Google investigation, a New York Times reporter wrote that the source code of Google’s state-of-the-art password system, “Gaia”, had likely been stolen during Operation Aurora. Gaia is a single sign-on system, designed to let users access Google’s many services with a single username and password.

The theft of Gaia was said to be of three-fold significance:

  • First, obtaining the source code of a commercial system is intellectual property theft, which is illegal in the US, where a case can be registered, but against whom?
  • Second, theft of the code could allow certain developers to illicitly create software like Gaia, and
  • Third, if Operation Aurora was state sponsored (as is speculated), the theft of intellectual property can be considered a form of economic warfare: the stolen technology levels the playing field and reduces the victim’s industrial advantage, to the benefit of the rival and/or adversary nation.

Beyond this, the theft of source code, particularly that of Gaia, could have major security implications. Analysts working on this theft raised an important question: how were the attackers able to obtain the source code for a system such as Gaia by leveraging a relatively small number of compromised computer systems? It turned out that many corporations use specialized servers as storehouses for this type of data, known as “intellectual property repositories”. The centralized location of these repositories makes it easier for teams to work collaboratively on a project and share information with each other. The professionals operating those networks assumed that the intellectual property would not be accessed, because of the security countermeasures taken to protect the network. They were therefore less focused on the security of the IP repository itself, lying within the perimeter of a corporation’s network. Operation Aurora invalidated this key assumption made by system administrators and IP repository software vendors.

Operation Aurora was able to exploit this assumption by using a zero-day vulnerability (a vulnerability for which no repair or patch was yet available at the time it was exploited).

Theft of intellectual property does not make it easy to determine what was stolen. As opposed to physical theft, where the stolen articles can easily be identified, in cyber-espionage and data exfiltration it is much more difficult to establish “what was stolen”. In advanced cyber-espionage, hackers often take various steps to cover their tracks and operate in a manner that makes it difficult to ascertain what data was taken. Software tools are now available to help with this problem, but identifying “what was stolen” in a cyber-espionage operation is still a difficult task.
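As a rough illustration of how software can help answer “what was stolen”, the sketch below summarises which files each account read, based on an access log. The log format, account names and file paths are entirely hypothetical, made up for this example.

```python
from collections import defaultdict

def files_read_by_account(log_lines):
    """Map each account to the set of files it read, using 'READ' entries only."""
    accessed = defaultdict(set)
    for line in log_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        _timestamp, account, action, path = parts
        if action == "READ":
            accessed[account].add(path)
    return dict(accessed)

# Hypothetical log lines in the form: timestamp account action path
sample_log = [
    "2010-01-05T02:14:00 build-bot READ /repo/auth/login.c",
    "2010-01-05T02:14:03 build-bot READ /repo/auth/session.c",
    "2010-01-05T09:30:12 alice WRITE /repo/docs/readme.txt",
    "2010-01-05T09:31:40 alice READ /repo/docs/readme.txt",
]

for account, paths in files_read_by_account(sample_log).items():
    print(account, sorted(paths))
```

Such a summary is only as good as the log behind it. If the intruders erase or alter the log while covering their tracks, the picture stays incomplete, which is why the task remains difficult.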

Incidents like Operation Aurora made many people ask whether AI and ML are safe, and whether these are the activities for which technologies are being developed with heavy investment. Such apprehensions about AI were felt even before big data was available, and they continue today, feeding the skepticism of misinformed people, of whom there is no dearth in society. But those who are well informed, the stakeholders and experts (not many, really), are busy weighing the pros and cons, and the pros weigh heavier than the cons. Hence, the system is continuously being improved through innovations in software, depending upon the pace at which new data is generated and becomes available.

The crux is that it all depends upon a) the speed at which data is being generated, b) the ability of the ML technique to extract patterns from this data, and c) the use of those patterns to automatically update the software, which helps machines analyze input data in innovative ways never used before. The cycle, in which “humans are not seen anywhere even if they are involved”, thus goes on, and no one knows what the end will be.

Jeffery Deaver wrote in “The Blue Nowhere”:

As a matter of fact, it is foolproof

The problem is that you don’t have to protect yourself against fools,

you have to protect yourself against people like me. OK

See you next time with another amazing story: Is AI a threat to humanity?

Bye
