Overview of Big Data and Machine Learning Biometric Methods
5 most popular methods of biometric identification
Modern biometric methods fall into two groups:
• recognition of physiological traits that do not change significantly over time and stay with a person throughout their life (fingerprints, face, iris and retina, palms, ears, DNA);
• analysis of behavioral characteristics that remain stable over long periods because the underlying actions are repeated constantly (speech, handwriting, typing style, gait).
In practice, the following biometric methods are most often used:
• fingerprints;
• face recognition (two-dimensional and three-dimensional);
• images of the iris and retina;
• vein patterns in the palm of the hand.
From ears to tail: a brief overview of other biometric methods
As science and technology advance, new ways of establishing identity from physiological and behavioral characteristics keep emerging. In particular, it was shown in 2017 that the amino acid composition of sweat is unique to each person and cannot be faked. Sweat, like blood, can therefore identify a person accurately and unambiguously.
In 2013 and 2014, biometric methods based on the odour of breath and even of clean hands were also proposed. In 2017, a method was developed for identifying people by finger micro-vibrations; it is almost as accurate as fingerprinting or retinal scanning but costs 10 times less.
In 2019, the Jetson infrared scanner entered the market; it can identify a person by their heartbeat at a distance of up to 200 meters with 95% accuracy. This is more reliable than facial recognition and fingerprint systems, which can be faked, but much slower: analyzing the heartbeat takes about half a minute, during which the person must be lightly dressed and motionless. Similar biometric methods based on analysis of the user's heart activity were proposed in 2014 and 2017.
The auricles can also serve as unique identifiers, since this part of the body differs even between identical twins. This biometric method, with an accuracy of 99.6%, is well suited to everyday personal tasks such as unlocking a smartphone.
Since 2015, Big Data behavioral biometrics systems have been under active development. They analyze gait, lip movement during conversation, voice and manner of speech, and even typing patterns on a keyboard. As a rule, such behavioral characteristics serve as additional rather than primary parameters of biometric identification.
Behavioral biometrics also includes data about user behavior on the Internet, which is collected using cookies. Recall that under the GDPR the user must be informed that such information is being collected and must give consent.
DNA can identify a person with 100% accuracy, but this biometric method is the most complex and expensive.
Top 10 Criteria for Choosing Big Data and Machine Learning Biometrics
The effectiveness of a biometric method depends on the conditions in which it is applied. Therefore, when choosing biometric parameters for personal identification, the following factors should be considered:
• identification time – how long it takes to collect the biometric data correctly and recognize it.
• ease of collecting biometric information: for example, a blood or sweat test will identify a person accurately, but taking these biological fluids requires special equipment and time.
• breadth of coverage: for example, Big Data biometric systems that analyze a flow of people in real time have significantly higher throughput than scanning the fingers, palms, eyes or other body parts of each person individually.
• method of collecting biometric data – contact methods are more accurate than contactless ones, but they take more time and have lower throughput.
• resistance to falsification, which implies a low False Acceptance Rate (FAR) and False Match Rate (FMR). This means that a type I error (false positive), in which the ML model wrongly identifies a person because the presented biometric data matches another person's template, is unlikely.
• immunity to interference, which means a low False Rejection Rate (FRR) and False Non-Match Rate (FNMR), that is, a small likelihood of type II errors and of denying service because the Machine Learning algorithm failed to recognize a legitimate user due to interference or poor quality of the presented data.
• the necessary infrastructure – Big Data tools for storing biometric templates (for example, HBase in Apache Hadoop or another NoSQL DBMS), analytical processing frameworks that run Machine Learning algorithms (Apache Spark, Flink), and sensors and other smart devices of the Internet of Things (IoT) that collect and transmit digitized biometric data for comparison with the templates in the database. Sometimes this entire infrastructure fits within a single smartphone, whereas large biometric systems such as the Indian AADHAAR or the Russian EBS require large-scale distributed solutions based on highly reliable clusters.
• overall reliability of the biometric system, which is characterized by low values of all recognition error metrics (FAR, FRR, FMR, FNMR, FER, FTC) and by low recovery time objective (RTO), recovery point objective (RPO) and other SRE indicators. Here, one should take into account not only the Machine Learning algorithms but also the IoT / IIoT equipment, data transmission channels, routing technologies, backup tools, cluster load balancing and other components of a large-scale Big Data system.
• application context, which includes the location and features of using the biometrics system, for example, street surveillance cameras, airport control, security checkpoints, online banking, telemedicine, etc.
• implementation cost – multi-factor systems that combine several biometric methods (for example, scanning of the palms and the retina together with movement features) are the most accurate but also the most expensive.
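The error rates named in the criteria above can be made concrete with a small sketch. It assumes a hypothetical matcher that returns a similarity score between 0 and 1 and accepts a person when the score reaches a threshold; the function name `far_frr` and all score values are invented for illustration and do not come from any real system.

```python
from typing import Sequence, Tuple


def far_frr(
    genuine_scores: Sequence[float],
    impostor_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (FAR, FRR) for a matcher that accepts when score >= threshold.

    genuine_scores:  scores from comparing a person with their own template.
    impostor_scores: scores from comparing a person with someone else's template.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    # False acceptance: an impostor comparison scores high enough to pass.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    # False rejection: a genuine comparison scores too low to pass.
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (
        false_accepts / len(impostor_scores),
        false_rejects / len(genuine_scores),
    )


# Hypothetical similarity scores, made up for this example.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.12, 0.35, 0.52, 0.71, 0.08]

for t in (0.5, 0.7, 0.9):
    far, frr = far_frr(genuine, impostor, t)
    print(f"threshold={t:.1f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold lowers FAR (better resistance to falsification) but raises FRR (more legitimate users rejected), which is why the two criteria have to be weighed against each other for a given application context.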
Multi-factor biometric systems are the most reliable, but expensive.
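Why combining several methods improves resistance to falsification can be illustrated with simple arithmetic. The sketch below assumes that a multi-factor system accepts a person only if every individual method accepts (an AND rule) and that the methods make errors independently; both are simplifying assumptions, and the error rates are hypothetical values chosen for the example, not measurements.

```python
from math import prod
from typing import Iterable, Tuple


def and_rule(rates: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Combine per-method (FAR, FRR) pairs under an AND rule.

    Assumes independent errors. An impostor must fool every method, so the
    combined FAR is the product of the individual FARs. A legitimate user is
    rejected if any method rejects, so the combined FRR is
    1 - product of (1 - FRR_i).
    """
    pairs = list(rates)
    combined_far = prod(far for far, _ in pairs)
    combined_frr = 1.0 - prod(1.0 - frr for _, frr in pairs)
    return combined_far, combined_frr


# Hypothetical (FAR, FRR) pairs for two methods.
fingerprint = (0.001, 0.02)
face = (0.01, 0.03)

far, frr = and_rule([fingerprint, face])
print(f"combined FAR={far:.6f}  combined FRR={frr:.4f}")
```

Under these assumptions the combined FAR falls well below that of either method alone, while the combined FRR rises, so extra reliability against falsification is paid for with more rejections of legitimate users as well as with higher cost.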
In the next article, we will examine how resistant biometric methods are to falsification, looking at several real cases in which biometric systems based on Big Data and Machine Learning were circumvented.