<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://projectswiki.eleceng.adelaide.edu.au/projects/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=A1798520</id>
	<title>Projects - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://projectswiki.eleceng.adelaide.edu.au/projects/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=A1798520"/>
	<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php/Special:Contributions/A1798520"/>
	<updated>2026-06-01T02:38:48Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.31.4</generator>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17603</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17603"/>
		<updated>2021-10-28T16:22:37Z</updated>

		<summary type="html">&lt;p&gt;A1798520: /* Results */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. In this project, ECG recordings were collected from the PhysioNet Database&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt;, and have been classified using existing ML techniques.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Signals collected from the human body are often used to measure or verify a patient&amp;#039;s health. One biological signal of particular interest is the electrocardiogram (ECG), which is collected by placing electrodes on the skin around the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than manual analysis&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we explored various methods of classifying ECGs, along with pre-processing methods to improve classification performance.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to:&lt;br /&gt;
* Investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns;&lt;br /&gt;
* Extend this to distinguishing between different heart diseases; and,&lt;br /&gt;
* Find a reasonably good method to do this.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
ECGs represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf&amp;lt;/ref&amp;gt;. As a result, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P wave, T wave and QRS complex, as well as the time between subsequent R peaks, are of interest, since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles) which push blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles; the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on the ECG&amp;lt;ref name=SK_B/&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG as well. This can make it harder for a physician to distinguish between conditions; hence, pre-processing and machine learning classification of ECGs may help to diagnose patients more precisely.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that covers heart disease, stroke and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although for this project only atrial fibrillation (AF) was considered. AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so the methods and results of a number of previous studies were examined. This section also briefly discusses the processes and results found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner: wavelet decomposition is applied to the signal, and a threshold is used to concentrate the signal over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;M. Dorraki, A. Fouladzadeh, A. Allison, B.R. Davis and D. Abbott; On moment of velocity for signal analysis, in Royal Society Open Science, vol. 6, issue 3, 2019, Available: https://royalsocietypublishing.org/doi/full/10.1098/rsos.182001&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is usually quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is critical to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between them. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and other intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
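&lt;br /&gt;
The differentiation, squaring and moving-window integration stages described above can be sketched in Python as follows. This is a simplified illustration rather than the full Pan-Tompkins algorithm: the band-pass filter and the adaptive thresholds are omitted, and the function names, the 5-sample window and the synthetic test signal are assumptions made for this example only.&lt;br /&gt;
&lt;br /&gt;
```python
def derivative(signal):
    """Two-point difference: emphasises the steep slopes of the QRS complex."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]


def squared(signal):
    """Point-wise squaring: makes every value positive and boosts large slopes."""
    return [x * x for x in signal]


def moving_window_integral(signal, width):
    """Causal moving average over `width` samples, zero-padded at the start."""
    padded = [0.0] * (width - 1) + list(signal)
    return [sum(padded[i:i + width]) / width for i in range(len(signal))]


def qrs_envelope(ecg, width=5):
    """Chain the three stages to obtain an envelope that peaks near each R-wave."""
    return moving_window_integral(squared(derivative(ecg)), width)


# Synthetic stand-in for an ECG: a flat baseline with two sharp spikes as R-peaks.
ecg = [0.0] * 40
ecg[10] = 1.0
ecg[30] = 1.0

envelope = qrs_envelope(ecg)
# Index of the first maximum of the envelope (close to the first spike).
peak_index = max(range(len(envelope)), key=envelope.__getitem__)
```
&lt;br /&gt;
The envelope is zero on the flat baseline and rises only around the two spikes, which is what makes a simple threshold on it usable for locating R-peaks.&lt;br /&gt;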
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
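&lt;br /&gt;
As an illustration of the frequency-domain feature discussed above, the following Python sketch finds the frequency component with the maximum amplitude in a uniformly sampled signal. It uses a naive discrete Fourier transform so that it needs only the standard library (an FFT would normally be used in practice). The function name, the 20 Hz sampling rate and the synthetic sinusoids are assumptions made for this example, not details taken from Hu et al.&lt;br /&gt;
&lt;br /&gt;
```python
import cmath
import math


def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest non-DC component of a naive DFT."""
    n = len(signal)
    best_bin, best_mag = 1, 0.0
    # Scan bins 1..n/2 (positive frequencies only, skipping the DC term).
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * fs / n


# Synthetic signals: a 1 Hz tone and a 5 Hz tone, 4 seconds each at 20 Hz.
fs = 20.0
slow_tone = [math.sin(2 * math.pi * 1.0 * i / fs) for i in range(80)]
fast_tone = [math.sin(2 * math.pi * 5.0 * i / fs) for i in range(80)]

f_slow = dominant_frequency(slow_tone, fs)
f_fast = dominant_frequency(fast_tone, fs)
```
&lt;br /&gt;
For these synthetic inputs the function returns 1 Hz and 5 Hz respectively. With 80 samples at 20 Hz the frequency resolution is 0.25 Hz, so longer recordings give a finer estimate.&lt;br /&gt;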
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis, wavelet decomposition decomposes the signal into a sum of wavelets&amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, including classes such as normal and abnormal, and possibly even separating the abnormal class into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights in a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
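&lt;br /&gt;
The train, classify and validate cycle described above can be sketched in a few lines of Python. The nearest-centroid classifier below is a deliberately simple stand-in chosen for brevity and is not one of the models used in this project. The function names, the fixed hold-out rule and the feature values are all made up for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
def split(samples, labels, test_every=4):
    """Hold out every `test_every`-th sample as the test set."""
    train, test = [], []
    for i, pair in enumerate(zip(samples, labels)):
        (test if i % test_every == 0 else train).append(pair)
    return train, test


def fit_centroids(train):
    """Training step: store the mean feature value of each class."""
    sums, counts = {}, {}
    for x, label in train:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign the class whose centroid is closest to the feature value."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids, test):
    """Validation step: fraction of test samples given the correct class."""
    correct = sum(1 for x, label in test if predict(centroids, x) == label)
    return correct / len(test)


# Toy single-feature data, e.g. one variability score per recording.
samples = [0.10, 0.12, 0.09, 0.11, 0.45, 0.50, 0.40, 0.48]
labels = ["normal"] * 4 + ["AF"] * 4

train, test = split(samples, labels)
centroids = fit_centroids(train)
test_accuracy = accuracy(centroids, test)
```
&lt;br /&gt;
The key point is that the test samples are never seen during training, so the accuracy measured on them estimates how the classifier would perform on new recordings.&lt;br /&gt;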
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long-short term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An SVM is a supervised machine learning algorithm which can be used to classify data based on the value of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the region in which it falls. Figure 2.10 shows a simple 2-dimensional example with Class 1 in red and Class 2 in blue. If a new data point, as shown by the green dot in Figure 2.10, is introduced, the SVM will classify this as Class 2, given the side of the line on which it falls.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational power required for classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. 
Parts of the classifier can then be modified as needed to improve its performance for the new task.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Long-Short Term Memory&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to classify between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; developed a novel approach using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. Figure 3.1 shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The workflow loops from signal filtering back to classification assessment, since various machine learning techniques were investigated, as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.1: ECG classification methodology.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG and MathWorks Example ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.2 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the expected locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.4 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.4: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;MATLAB ECG Wavelet Classification&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An example from MathWorks demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier using MATLAB&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html&amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into a training set and a test set. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. This was a suitable starting point from which to compare later results.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigated the effects on ECG classification of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity were applied to the same dataset, then the raw dataset and these cleaned versions were fed into the classifiers to measure the importance of the pre-processing step.&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models. The SVM was chosen due to its relative simplicity, the CNN was selected as it is effective at analysing images such as spectrograms, and the LSTM network was chosen as it is simpler than other neural networks like the CNN, but still shares some of its advantages.&lt;br /&gt;
&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF can be distinguished from other heart rhythms by analysing the beat-to-beat intervals of an ECG recording. With that aim, we performed feature extraction to obtain information about heart rate variability (HRV), before using the SVM to recognise the pattern of AF signals. Figure 3.6 shows the receiver operating characteristic (ROC) curves of the SVM when run for each of the 3 pre-processing options, using HRV feature extraction. The closer the ROC curve hugs the top left corner, the better the classification. Hence, wavelet denoising was the most effective pre-processing technique in this case.&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.5: Features in HRV&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Feature !! Meaning !! Unit&lt;br /&gt;
|-&lt;br /&gt;
| Heart rate || number of heart beats per minute || bpm&lt;br /&gt;
|-&lt;br /&gt;
| Mean interval || the mean value of beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| SDNN || standard deviation of beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| SDSD || standard deviation of the differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| RMSSD || root mean square of the differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| NN50 || the number of successive interval differences that are greater than 50 ms || du&lt;br /&gt;
|-&lt;br /&gt;
| pNN50 || the percentage of successive interval differences that are greater than 50 ms || %&lt;br /&gt;
|-&lt;br /&gt;
| NN20 || the number of successive interval differences that are greater than 20 ms || du&lt;br /&gt;
|-&lt;br /&gt;
| pNN20 || the percentage of successive interval differences that are greater than 20 ms || %&lt;br /&gt;
|-&lt;br /&gt;
| ShE || Shannon entropy of heart beats || du&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 10&lt;br /&gt;
|}&lt;br /&gt;
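The time-domain HRV features in Table 3.5 can be illustrated with a short sketch. This is not the extraction code used in the project; it assumes the conventional HRV definitions (SDSD, RMSSD, NNx and pNNx computed on differences between successive intervals), and the function name hrv_features is hypothetical. Shannon entropy is left out because the way it was computed is not described here.&lt;br /&gt;

```python
import math
import statistics


def hrv_features(rr_ms):
    """Time-domain HRV features from a list of RR intervals in milliseconds.

    Assumes the conventional definitions based on successive differences.
    """
    # Differences between each pair of neighbouring RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    mean_rr = statistics.fmean(rr_ms)
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    nn20 = sum(1 for d in diffs if abs(d) > 20)
    return {
        "heart_rate_bpm": 60000.0 / mean_rr,  # 60 000 ms per minute
        "mean_interval_ms": mean_rr,
        "sdnn_ms": statistics.stdev(rr_ms),
        "sdsd_ms": statistics.stdev(diffs),
        "rmssd_ms": math.sqrt(statistics.fmean([d * d for d in diffs])),
        "nn50": nn50,
        "pnn50_percent": 100.0 * nn50 / len(diffs),
        "nn20": nn20,
        "pnn20_percent": 100.0 * nn20 / len(diffs),
    }


# Illustrative RR intervals (made-up values, not project data).
features = hrv_features([800, 810, 790, 860, 800])
for name, value in features.items():
    print(name, round(value, 2))
```

An irregular rhythm such as AF would be expected to raise the variability measures (SDNN, RMSSD, pNN50) relative to a regular rhythm, which is why these features are useful inputs to the SVM.&lt;br /&gt;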
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, multi-domain, statistical and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, these features were also tested with the SVM algorithm. We developed our own algorithm for selecting and extracting the HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features shown in Table 3.6. The ROC curve for each pre-processing option with these features is shown in Figure 3.7.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.6: Features in multi-domain and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&amp;lt;ref name=LN_F/&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=400px&amp;gt;&lt;br /&gt;
File:SVM HRV AF.png|&amp;#039;&amp;#039;Figure 3.6: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;&lt;br /&gt;
File:SVM TS AF.png|&amp;#039;&amp;#039;Figure 3.7: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
The code first attempted to classify the data without extracting any features, which serves as a baseline for later comparison. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units (meaning each signal is mapped to 100 features) and prepares the output for the fully-connected layer (neural network). Three classes are output: normal ECG, AF, and other abnormality. The training progress is shown in Figure 3.8. Note that the accuracy sits at around 40%, and that training takes a considerable amount of time (over 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction was used to improve these results. By default, the program extracted the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, each 3000-sample signal is reduced to a 2-by-63 matrix. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure 3.9. Attempts were made to alter the features extracted; however, this led either to errors or to extremely poor results, and so is not shown here.&lt;br /&gt;
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section below.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure 3.8: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure 3.9: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning involves taking models that were previously trained on large sets of general images and training them further on our dataset, as demonstrated in Figure 3.10. An advantage of this method is that the model does not need to be built and trained from scratch, which is time-consuming and requires a large dataset. However, the model still needed to be trained and fine-tuned to recognise patterns in our ECG recordings. We chose SqueezeNet to classify ECGs since it is one of the smallest pre-trained CNNs yet still achieves high performance, making it possible to deploy on limited-memory hardware.&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|centre|&amp;#039;&amp;#039;Figure 3.10: Transfer Learning flow chart.&amp;#039;&amp;#039;&amp;lt;ref name=LN_M/&amp;gt;]]&lt;br /&gt;
The ROC curve of the results from this classifier for each pre-processing technique is shown in Figure 3.11. In this project, we modified the code from MathWorks using transfer learning [https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html here]&amp;lt;ref name=LN_CNN&amp;gt;The MathWorks, Inc.; &amp;#039;&amp;#039;Classify Time Series Using Wavelet Analysis and Deep Learning&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html&amp;lt;/ref&amp;gt;.&lt;br /&gt;
[[File:SqueezeNet.png|700px|thumb|centre|&amp;#039;&amp;#039;Figure 3.11: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table 4.1 and Figures 4.2 and 4.3 below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the precision for the abnormal class is 100% and the recall is 50%, so the F1-score for identifying the abnormal cases would be 66.7%, which is significantly lower than the accuracy, but gives far more meaning to the results.&lt;br /&gt;
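The arithmetic in the example above can be checked with a few lines of code. This is only an illustrative sketch: the function name precision_recall_f1 is hypothetical, and the counts are those of the 98 normal / 2 abnormal example, with the abnormal class treated as the positive class.&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1-score of the positive class from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 98 normal cases correctly labelled normal (true negatives),
# 1 abnormal case found (true positive), 1 abnormal case missed (false negative).
tn, tp, fp, fn = 98, 1, 0, 1
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
# prints: accuracy=0.990 precision=1.000 recall=0.500 f1=0.667
```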
&lt;br /&gt;
In each case, the results were displayed as a confusion chart. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted, the stronger the diagonal in the confusion matrix, and the better the performance of the classifier.&lt;br /&gt;
&lt;br /&gt;
Our findings are summarised in Table 4.1 and Figure 4.2 below, using the F1-score of the AF class. These results demonstrate that the CNN and the SVM using 169 features outperformed the other classification methods, especially when wavelet denoising was used. The LSTM also achieved a high score with wavelet denoising; however, it relied on instantaneous frequency and spectral entropy, which are sensitive to noise. MoV removed certain low-frequency components, and hence negatively impacted the features, resulting in lower performance in all classifiers. In addition, the 10 time-domain HRV features appear to be the most important features for the SVM, since the HRV-based SVM performed only slightly worse than the 169-feature SVM. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Figure 4.3 shows the ROC curve for the best result from each classification method. It demonstrates that the multi-feature SVM and the CNN rank very closely, and are notably better than the other classification methods investigated.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 4.1: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HRV || 0.788&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HRV || 0.793&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HRV || 0.675&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.836&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.812&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery heights=350px mode=packed&amp;gt;&lt;br /&gt;
File:F1 Scores of Results.png|&amp;#039;&amp;#039;Figure 4.2: Comparison of Results for each Technique.&amp;#039;&amp;#039;&lt;br /&gt;
File:FinalPerformance.png|&amp;#039;&amp;#039;Figure 4.3: Robustness comparison between various classifiers.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
So, can we teach a machine to be a cardiologist? The short answer is yes. In terms of teaching a machine to accurately recognise different heart conditions by analysing the ECG recordings of patients, this is entirely possible, as our results have shown. It is also worth mentioning that studies in the literature have achieved higher performance than ours, so with a deeper understanding and more fine-tuning, a highly reliable model could be created.&lt;br /&gt;
&lt;br /&gt;
Future work could improve classification performance. This could be done by modifying the combination of pre-processing, feature extraction and classification to find the optimal solution, or by finding different methods for each of these processes which are better suited to the data. Our model was designed to distinguish AF from normal and other abnormal conditions, but the classifier could be extended to identify a greater range of cardiovascular conditions.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17461</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17461"/>
		<updated>2021-10-24T15:44:41Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. In this project, ECG recordings were collected from the PhysioNet Database&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt;, and have been classified using existing ML techniques.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal which is of interest is the electrocardiogram (ECG), which is collected by placing electrodes on the skin around the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so the early detection and treatment of CVD is critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we explored various methods of classifying ECGs, and pre-processing methods to improve the classification.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to:&lt;br /&gt;
* Investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns;&lt;br /&gt;
* Extend this to distinguishing between different heart diseases; and,&lt;br /&gt;
* Find a reasonably good method to do this.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
ECGs represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review,  in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf&amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular the P wave, T wave and QRS complex, as well as time between subsequent R peaks, are of interest, since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles) which push blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so can quickly identify whether a patients&amp;#039; heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may present dissimilar signs in different patients, and two distinct diseases may have a similar effect on the ECG&amp;lt;ref name=SK_B/&amp;gt;. Furthermore, electrodes pick up not only activity of the heart, but other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG as well. This can make it harder for a physician to interpret the ECG; hence, pre-processing and machine learning classification of ECGs may help to diagnose patients more precisely.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although for this project just atrial fibrillation (AF) was considered. AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so the methods and results produced by a number of previous studies could be examined. This section also briefly discusses the processes and results found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. Wavelet decomposition is applied to the signal, and a certain threshold is used to concentrate the signal over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;M. Dorraki, A. Fouladzadeh, A. Allison, B.R. Davis and D. Abbott; On moment of velocity for signal analysis, in Royal Society Open Science, vol. 6, issue 3, 2019, Available: https://royalsocietypublishing.org/doi/full/10.1098/rsos.182001&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is usually quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is critical to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between them. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and other intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
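The stages described above (differentiation, squaring and moving-window integration) can be sketched in a few lines. This is a heavily simplified illustration and not the full Pan-Tompkins algorithm: the band-pass filtering stage and the adaptive thresholds are omitted, a fixed threshold at half the maximum is assumed instead, and all function names and window lengths here are hypothetical choices.&lt;br /&gt;

```python
def derivative(x):
    """First difference, which approximates the slope of the signal."""
    return [0.0] + [b - a for a, b in zip(x, x[1:])]


def moving_window_integrate(x, width):
    """Running mean over the most recent `width` samples."""
    out = []
    running = 0.0
    for i, value in enumerate(x):
        running += value
        if i >= width:
            running -= x[i - width]
        out.append(running / width)
    return out


def detect_peaks(x, threshold, refractory):
    """Indices of local maxima above `threshold`, spaced by more than `refractory`."""
    peaks = []
    for i in range(1, len(x) - 1):
        is_local_max = x[i] > x[i - 1] and x[i] >= x[i + 1]
        far_enough = not peaks or i - peaks[-1] > refractory
        if is_local_max and x[i] > threshold and far_enough:
            peaks.append(i)
    return peaks


def qrs_candidates(ecg, fs):
    """Candidate QRS locations (sample indices) for an ECG sampled at fs Hz."""
    # Squaring makes every slope positive and emphasises steep QRS slopes.
    squared = [d * d for d in derivative(ecg)]
    # Assumed window of about 150 ms, roughly the width of a QRS complex.
    integrated = moving_window_integrate(squared, max(1, int(0.15 * fs)))
    # Assumed fixed threshold; the real algorithm adapts it over time.
    threshold = 0.5 * max(integrated)
    return detect_peaks(integrated, threshold, int(0.2 * fs))


# Synthetic test signal: one sharp spike per second at 100 Hz (not real ECG data).
fs = 100
ecg = [0.0] * 300
for r_index in (50, 150, 250):
    ecg[r_index] = 1.0
peaks = qrs_candidates(ecg, fs)
rr_intervals_s = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
print(peaks, rr_intervals_s)
```

The detected indices lag the true spike positions slightly because of the integration window, but the spacing between them still gives the RR intervals.&lt;br /&gt;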
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis, wavelet decomposition decomposes the signal into a sum of wavelets&amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, including classes such as normal and abnormal, and possibly even separating the abnormal class into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML, this may make clusters of each class, or assign weights to a neural network, for example. Next, the ML is used to classify the test set of data. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), the convolutional neural network (CNN) and the recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or a hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the side of the boundary on which it falls. Figure 2.10 shows a simple 2-dimensional example with Class 1 in red and Class 2 in blue. If a new data point, as shown by the green dot in Figure 2.10, is introduced, the SVM will classify it as Class 2, given the side on which it falls.&lt;br /&gt;
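&lt;br /&gt;
The decision step for a new data point can be sketched in a few lines of Python. This is a minimal illustration only: the weights w and the offset b are hypothetical values standing in for a boundary that training would normally learn, and they are not taken from this project.&lt;br /&gt;

```python
# Minimal sketch of how a trained linear SVM assigns a class to a new point.
# The hyperplane (w, b) is a hypothetical example, not a result from this project.

def decision_value(w, b, x):
    """Signed score of point x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return 2 if x lies on the positive side of the hyperplane, otherwise 1."""
    return 2 if decision_value(w, b, x) > 0 else 1

w = [1.0, 1.0]           # hypothetical normal vector of the separating line
b = -5.0                 # hypothetical offset
new_point = [4.0, 3.0]   # plays the role of the new (green) data point
print(classify(w, b, new_point))
```

Training an SVM amounts to choosing w and b so that the margin between the classes is as large as possible; only the final decision rule is shown here.&lt;br /&gt;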
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Artificial neural networks (ANNs) are capable of extracting complex and non-linear sets of features from a set of data. They are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
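&lt;br /&gt;
The convolution and pooling stages can be illustrated with a minimal one-dimensional sketch in Python. A CNN operating on images applies the same idea in two dimensions. The input values and the kernel below are arbitrary illustrative numbers, whereas in a real CNN the kernel weights are learned during training.&lt;br /&gt;

```python
# Minimal 1D sketch of a convolution layer followed by max pooling.
# The kernel values and the input are arbitrary illustrative numbers.

def conv1d(signal, kernel):
    """Slide the kernel along the signal (no padding) and sum the products.

    As in most deep learning libraries, the kernel is not flipped,
    so strictly speaking this is a cross-correlation.
    """
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]

def max_pool1d(features, size=2):
    """Keep the largest value in each non-overlapping window of the given size."""
    return [
        max(features[i:i + size])
        for i in range(0, len(features) - size + 1, size)
    ]

signal = [0, 1, 3, 1, 0, 0, 2, 0]
kernel = [-1, 2, -1]              # hypothetical kernel that responds to narrow peaks
features = conv1d(signal, kernel)
pooled = max_pool1d(features)
print(features)
print(pooled)
```

The pooled output is half the length of the feature map, which is the size reduction that lowers the cost of the later classification stage.&lt;br /&gt;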
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported 99% accuracy when using a 2D-CNN, but only 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance for the new task.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. It is an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as a sequence is processed, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; developed a novel approach using a bidirectional LSTM network and wavelet sequences to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. Figure 3.1 shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. There is a loop from signal filtering to classification assessment, since various machine learning techniques were investigated, as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.1: ECG classification methodology.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG and MathWorks Example ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we manually inspected a few signals to identify the relevant waves and segments in each signal.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.2 shows an example of a normal, healthy ECG waveform. Notice that the rhythm (i.e. the time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the expected locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.4 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.4: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;MATLAB ECG Wavelet Classification&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An example from MathWorks demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html&amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into a training set and a test set. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. This was a suitable starting point from which to compare later results.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigated the effects on ECG classification of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity (MoV) were applied to the same dataset, then the raw dataset and these cleaned versions were fed into the classifiers to measure the importance of the pre-processing stage. &lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models. The SVM was chosen due to its relative simplicity, the CNN was selected as it is effective at analysing images such as spectrograms, and the LSTM network was chosen as it is simpler than other neural networks like the CNN, but still shares some of its advantages.&lt;br /&gt;
&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF can be distinguished from other heart rhythms by analysing the beat-to-beat intervals of an ECG recording. With that aim, we performed feature extraction to find information about heart rate variability (HRV), before using the SVM to recognise the pattern of AF signals. Figure 3.6 shows the receiver operating characteristic (ROC) curves of the SVM when run for each of the 3 pre-processing options, using HRV feature extraction. The closer the ROC curve hugs the top left corner, the better the classification. On this basis, wavelet denoising was the most effective pre-processing technique in this case.&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.5: HRV features&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Feature !! Meaning !! Unit&lt;br /&gt;
|-&lt;br /&gt;
| Heart rate || number of heart beats per minute || bpm&lt;br /&gt;
|-&lt;br /&gt;
| Mean interval || mean value of beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| SDNN || standard deviation of beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| SDSD || standard deviation of differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| RMSSD || root mean square of differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| NN50 || number of successive interval differences greater than 50 ms || du&lt;br /&gt;
|-&lt;br /&gt;
| pNN50 || percentage of successive interval differences greater than 50 ms || %&lt;br /&gt;
|-&lt;br /&gt;
| NN20 || number of successive interval differences greater than 20 ms || du&lt;br /&gt;
|-&lt;br /&gt;
| pNN20 || percentage of successive interval differences greater than 20 ms || %&lt;br /&gt;
|-&lt;br /&gt;
| ShE || Shannon entropy of heart beats || du&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 10&lt;br /&gt;
|}&lt;br /&gt;
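&lt;br /&gt;
As a rough illustration of how the time-domain HRV features can be obtained from a list of beat-to-beat (RR) intervals, a minimal Python sketch follows. It assumes the commonly used definitions in which SDSD, RMSSD, NNx and pNNx are computed from the differences between successive intervals, and it uses the sample standard deviation. The function name and the example intervals are hypothetical, and the extraction code used in this project may differ in detail. Shannon entropy is omitted for brevity.&lt;br /&gt;

```python
# Minimal sketch of time-domain HRV features computed from RR intervals in ms.
# Assumes the commonly used definitions based on successive differences;
# the project's own extraction code may differ in detail.
import math
import statistics

def hrv_features(rr_ms):
    """Return a dictionary of basic HRV features for a list of RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]   # successive differences
    mean_rr = statistics.mean(rr_ms)
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    nn20 = sum(1 for d in diffs if abs(d) > 20)
    return {
        "heart_rate_bpm": 60000.0 / mean_rr,             # 60 000 ms per minute
        "mean_interval_ms": mean_rr,
        "SDNN_ms": statistics.stdev(rr_ms),
        "SDSD_ms": statistics.stdev(diffs),
        "RMSSD_ms": math.sqrt(statistics.mean([d * d for d in diffs])),
        "NN50": nn50,
        "pNN50_percent": 100.0 * nn50 / len(diffs),
        "NN20": nn20,
        "pNN20_percent": 100.0 * nn20 / len(diffs),
    }

example_rr = [800, 810, 790, 850, 780]   # made-up RR intervals in ms
for name, value in hrv_features(example_rr).items():
    print(f"{name}: {value:.2f}")
```

In practice the RR intervals would come from an R-peak detector such as the Pan-Tompkins algorithm described earlier.&lt;br /&gt;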
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, multi-domain, statistical and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, these features were also tested with the SVM algorithm. We developed our own algorithm for selecting and extracting the HRV features shown in Table 3.5, and used a tool named ExtractFeatures.m, provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt;, to extract the 169 features shown in Table 3.6. The ROC curve for each pre-processing option with these features is shown in Figure 3.7.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.6: Features in multi-domain and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&amp;lt;ref name=LN_F/&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=400px&amp;gt;&lt;br /&gt;
File:SVM HRV AF.png|&amp;#039;&amp;#039;Figure 3.6: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;&lt;br /&gt;
File:SVM TS AF.png|&amp;#039;&amp;#039;Figure 3.7: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
The code first attempts to classify the data without extracting any features, which serves as a baseline for later comparison. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units (meaning each signal is mapped to 100 features) and prepares the output for the fully-connected layer (neural network). Three classes are output, being normal ECG, AF, and other abnormality. The training progress is shown in Figure 3.8. Notice that the accuracy sits at around 40%, and that training takes a considerable amount of time (over 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction was used to improve these results. By default, the program extracted the instantaneous frequency and the spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 vector. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure 3.9. Attempts were made to alter the features extracted; however, this led either to errors or to extremely poor results, and so is not shown here.&lt;br /&gt;
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section below.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure 3.8: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure 3.9: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning involves taking models that were previously trained on large numbers of general images and training them further on our dataset, as demonstrated in Figure 3.10. An advantage of this method is that the model does not need to be built and trained from scratch, which is time-consuming and requires a large dataset. However, the model still needed to be trained and fine-tuned to recognise patterns in our ECG recordings. We chose SqueezeNet to classify ECGs since it is the smallest pre-trained CNN but still achieves high performance, making it possible to deploy on limited-memory hardware. &lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|centre|&amp;#039;&amp;#039;Figure 3.10: Transfer Learning flow chart.&amp;#039;&amp;#039;&amp;lt;ref name=LN_M/&amp;gt;]]&lt;br /&gt;
In this project, we modified the MathWorks example code that uses transfer learning&amp;lt;ref name=LN_CNN&amp;gt;The MathWorks, Inc.; &amp;#039;&amp;#039;Classify Time Series Using Wavelet Analysis and Deep Learning&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html&amp;lt;/ref&amp;gt;. The ROC curve of the results from this classifier for each pre-processing technique is shown in Figure 3.11.&lt;br /&gt;
[[File:SqueezeNet.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.11: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table 4.1 and Figures 4.2 and 4.3 below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in real-world systems where a sample set may contain 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. So in this example, the F1-score for identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy, but gives far more meaning to the results.&lt;br /&gt;
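&lt;br /&gt;
The arithmetic behind this example can be checked with a short Python sketch. It treats the abnormal class as the positive class; the variable names are chosen here for illustration only.&lt;br /&gt;

```python
# Worked version of the 98 normal / 2 abnormal example,
# with "abnormal" treated as the positive class.
tp = 1    # abnormal case correctly flagged
fn = 1    # abnormal case missed (classified as normal)
fp = 0    # normal cases wrongly flagged as abnormal
tn = 98   # normal cases correctly classified as normal

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy  = {accuracy:.3f}")
print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
print(f"F1-score  = {f1:.3f}")
```

The precision is perfect because nothing was wrongly flagged, but the recall of 0.5 pulls the F1-score down to about 0.667 even though the accuracy is 0.99.&lt;br /&gt;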
&lt;br /&gt;
In each case, the results were displayed as a confusion chart. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted, the stronger the diagonal in the confusion matrix, and the better the performance of the classifier.&lt;br /&gt;
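&lt;br /&gt;
A confusion matrix of this kind can be built by counting pairs of true and predicted labels, as in the minimal Python sketch below. The labels are made-up illustrative data, not results from this project, and the function name is hypothetical.&lt;br /&gt;

```python
# Minimal sketch of building a confusion matrix from true and predicted labels.
# The labels below are made-up illustrative data, not results from this project.
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

classes = ["Normal", "AF", "Other"]
true_labels = ["Normal", "Normal", "AF", "AF", "Other", "Other", "Normal", "AF"]
predicted = ["Normal", "Other", "AF", "AF", "Other", "Normal", "Normal", "Normal"]

matrix = confusion_matrix(true_labels, predicted, classes)
for name, row in zip(classes, matrix):
    print(f"{name:8s}", row)

# Correct predictions lie on the diagonal of the matrix.
correct = sum(matrix[i][i] for i in range(len(classes)))
print("correct:", correct, "of", len(true_labels))
```

Off-diagonal entries show which classes are being confused with each other, which is the information a single summary score hides.&lt;br /&gt;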
&lt;br /&gt;
Our findings are summarised in Table 4.1 and Figure 4.2 below, using the F1-score of the AF class. These results demonstrate that the CNN and the SVM using 169 features outperformed the other classification methods, especially when wavelet denoising was used. The LSTM also achieved a high score with wavelet denoising; however, it used instantaneous frequency and spectral entropy, which are sensitive to noise. In addition, MoV removed certain low-frequency components, and hence negatively impacted the features, resulting in lower performance for all classifiers. The 10 time-domain HRV features appear to be the most important features for the SVM, since the HRV-based SVM performed only slightly worse than the 169-feature SVM. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Figure 4.3 shows the ROC curve for the best result from each classification method. It demonstrates that the multi-feature SVM and the CNN rank very closely, and are notably better than the other classification methods investigated.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 4.1: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HRV || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HRV || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HRV || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8135&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8357&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.7597&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery heights=350px mode=packed&amp;gt;&lt;br /&gt;
File:F1 Scores of Results.png|&amp;#039;&amp;#039;Figure 4.2: Comparison of Results for each Technique.&amp;#039;&amp;#039;&lt;br /&gt;
File:FinalPerformance.png|&amp;#039;&amp;#039;Figure 4.3: Robustness comparison between various classifiers.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
So, can we teach a machine to be a cardiologist? The short answer is yes. Teaching a machine to accurately recognise different heart conditions by analysing the ECG recordings of patients is entirely possible, as our results have shown. It is also worth mentioning that studies in the literature have achieved higher performance than ours, so with a deeper understanding and more fine-tuning, a highly reliable model can be created.&lt;br /&gt;
&lt;br /&gt;
Future work could improve classification performance. This could be done by modifying the combination of pre-processing, feature extraction and classification to find the optimal solution, or by finding different methods for each of these processes which are better suited to the data. Our model was designed to distinguish AF from normal and other abnormal conditions, but the classifier could be extended to identify a greater range of cardiovascular conditions.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17460</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17460"/>
		<updated>2021-10-24T15:23:03Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. In this project, ECG recordings were collected from the PhysioNet Database&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt;, and have been classified using existing ML techniques.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment, or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal which is of interest is the electrocardiogram (ECG), which is collected by placing electrodes on the skin around the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so the early detection and treatment of these diseases is critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we explored various methods of classifying ECGs, and pre-processing methods to improve this.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to:&lt;br /&gt;
* Investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns;&lt;br /&gt;
* Extend this to distinguishing between different heart diseases; and,&lt;br /&gt;
* Find a reasonably good method to do this.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
ECGs represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review,  in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf&amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between subsequent R peaks, are of interest, since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles) which push blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly identify whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on the ECG&amp;lt;ref name=SK_B/&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG as well. This can make it harder for a physician to interpret the ECG; hence, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although for this project just atrial fibrillation (AF) was considered. AF is an abnormal heart condition in which the regular atrial activity is instead replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so the methods and results of a number of previous studies could be examined. This section also briefly discusses the processes and results found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. The signal is first decomposed into wavelet coefficients, and a threshold is then applied so that the signal is concentrated over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
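&lt;br /&gt;
The decompose, threshold and reconstruct idea can be sketched in a few lines of Python. This is only an illustration and not the code used in this project: it assumes a single-level Haar wavelet, soft thresholding and an even-length input, and the function name haar_denoise and the threshold values are hypothetical.&lt;br /&gt;

```python
import math


def haar_denoise(signal, threshold):
    """One-level Haar wavelet denoising with soft thresholding (even-length input)."""
    root2 = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    # Decompose each pair of samples into an approximation and a detail coefficient.
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    # Soft threshold (an assumed choice): shrink every detail coefficient towards zero.
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    # Reconstruct the signal from the approximation and the shrunk details.
    output = []
    for a, d in zip(approx, shrunk):
        output.extend([(a + d) / root2, (a - d) / root2])
    return output


unchanged = haar_denoise([1.0, 3.0, 2.0, 2.0], threshold=0.0)
smoothed = haar_denoise([1.0, 3.0, 2.0, 2.0], threshold=10.0)
```

With a threshold of zero the signal is reconstructed unchanged, whereas a large threshold removes all detail and replaces each pair of samples with its mean. A practical ECG denoiser would use several decomposition levels and a wavelet that better resembles the QRS complex.&lt;br /&gt;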
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;M. Dorraki, A. Fouladzadeh, A. Allison, B.R. Davis and D. Abbott; On moment of velocity for signal analysis, in Royal Society Open Science, vol. 6, issue 3, 2019, Available: https://royalsocietypublishing.org/doi/full/10.1098/rsos.182001&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is usually quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is critical to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between them. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and other intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
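&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the full Pan-Tompkins algorithm: the band-pass filtering and adaptive thresholding stages are omitted, the derivative is a plain first difference, and the function name, window length and toy signal are hypothetical.&lt;br /&gt;

```python
def pan_tompkins_stages(ecg, window):
    """Return the derivative, squared and integrated signals for a filtered ECG."""
    # Differentiate: a first difference emphasises the steep QRS slopes.
    derivative = [b - a for a, b in zip(ecg, ecg[1:])]
    # Square: makes every sample positive and amplifies large slopes.
    squared = [d * d for d in derivative]
    # Moving-window integration: average over the most recent samples.
    integrated = []
    running = 0.0
    for i, value in enumerate(squared):
        running += value
        if i >= window:
            running -= squared[i - window]
        integrated.append(running / min(i + 1, window))
    return derivative, squared, integrated


# Toy signal: a flat baseline with one sharp spike standing in for an R-peak.
signal = [0.0, 0.0, 0.1, 0.0, 1.0, 0.0, 0.0, 0.1, 0.0, 0.0]
_, _, energy = pan_tompkins_stages(signal, window=3)
peak_index = max(range(len(energy)), key=energy.__getitem__)
```

The integrated signal peaks at the location of the spike, which is the behaviour an R-peak detector relies on.&lt;br /&gt;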
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis, wavelet decomposition decomposes the signal into a sum of wavelets&amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, including classes such as normal and abnormal, and possibly even separating the abnormal class into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
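&lt;br /&gt;
A minimal Python sketch of this split-and-validate procedure is given below. It is illustrative only: the function names, the 30% test fraction and the toy labels are assumptions, and the classifier itself is left out.&lt;br /&gt;

```python
import random


def select(items, indices):
    """Pick the entries of items at the given indices."""
    return [items[i] for i in indices]


def train_test_split(samples, labels, test_fraction, seed):
    """Shuffle the data and split it into a training set and a test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = (select(samples, train_idx), select(labels, train_idx))
    test = (select(samples, test_idx), select(labels, test_idx))
    return train, test


def accuracy(predicted, actual):
    """Fraction of test signals whose assigned class matches the actual class."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


samples = list(range(10))
labels = ["N" if s % 2 == 0 else "A" for s in samples]
(train_x, train_y), (test_x, test_y) = train_test_split(samples, labels, 0.3, seed=0)
score = accuracy(["N", "A", "N", "N"], ["N", "A", "A", "N"])
```

Here three of the four predicted labels match, giving an accuracy of 0.75.&lt;br /&gt;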
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An SVM is a supervised machine learning algorithm which can be used to classify data based on the value of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the region in which it falls. Figure 2.10 shows a simple 2-dimensional example with Class 1 in red and Class 2 in blue. If a new data point, as shown by the green dot in Figure 2.10, is introduced, the SVM will classify it as Class 2, given the side on which it falls.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of data classification. Finally, a fully-connected layer is used to classify the data, and this is usually a regular ANN. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance for the new task.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Long-Short Term Memory&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. They are an improvement over traditional RNNs which suffer from short-term memory, and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to classify between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; developed a novel approach using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. Figure 3.1 shows the workflow for distinguishing AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The workflow loops from signal filtering back to classification assessment, since several machine learning techniques were investigated, as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.1: ECG classification methodology.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG and MathWorks Example ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.2 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the expected locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.4 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.4: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;MATLAB ECG Wavelet Classification&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An example from MathWorks demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier using MATLAB&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html&amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into a training set and a test set. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. This was a suitable starting point from which to compare later results.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigated the effects on ECG classification of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity were applied to the same dataset, then the raw dataset and these cleaned versions were fed into classifiers to measure the importance of the pre-processing step.&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models. The SVM was chosen due to its relative simplicity, the CNN was selected as it is effective at analysing images such as spectrograms, and the LSTM network was chosen as it is simpler than other neural networks like the CNN, but still shares some of its advantages.&lt;br /&gt;
&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF can be distinguished from other heart rhythms by analysing the beat-to-beat intervals of an ECG recording. With that aim, we performed feature extraction to find information about heart rate variability (HRV), before using the SVM to recognise the pattern of AF signals. Figure 3.6 shows the receiver operating characteristic (ROC) curves of the SVM when run for each of the three pre-processing options (raw, wavelet denoising and MoV), using HRV feature extraction. The closer the ROC curve hugs the top left corner, the better the classification. Hence wavelet denoising was the most effective pre-processing technique in this case.&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.5: Features in HRV&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Feature !! Meaning !! Unit&lt;br /&gt;
|-&lt;br /&gt;
| Heart rate || number of heart beats per minute || bpm&lt;br /&gt;
|-&lt;br /&gt;
| Mean interval || the mean value of beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| SDNN || standard deviation of beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| SDSD || standard deviation of the differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| RMSSD || root mean square of the differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| NN50 || the number of successive interval differences that are greater than 50 ms || du&lt;br /&gt;
|-&lt;br /&gt;
| pNN50 || the percentage of successive interval differences that are greater than 50 ms || %&lt;br /&gt;
|-&lt;br /&gt;
| NN20 || the number of successive interval differences that are greater than 20 ms || du&lt;br /&gt;
|-&lt;br /&gt;
| pNN20 || the percentage of successive interval differences that are greater than 20 ms || %&lt;br /&gt;
|-&lt;br /&gt;
| ShE || Shannon entropy of heart beats || du&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 10&lt;br /&gt;
|}&lt;br /&gt;
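As an illustration of how the features in Table 3.5 can be computed from a list of RR intervals, a short Python sketch follows. It is not the algorithm developed in this project: the function name hrv_features and the example intervals are hypothetical, the standard definitions based on successive differences are assumed, and the Shannon entropy is omitted because its binning is not specified here.&lt;br /&gt;

```python
import math
import statistics


def hrv_features(rr_ms):
    """Compute basic HRV features from a list of RR intervals in milliseconds."""
    # Differences between successive beat-to-beat intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    mean_rr = statistics.fmean(rr_ms)
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    nn20 = sum(1 for d in diffs if abs(d) > 20)
    return {
        "heart_rate_bpm": 60000.0 / mean_rr,
        "mean_interval_ms": mean_rr,
        "sdnn_ms": statistics.stdev(rr_ms),
        "sdsd_ms": statistics.stdev(diffs),
        "rmssd_ms": math.sqrt(statistics.fmean([d * d for d in diffs])),
        "nn50": nn50,
        "pnn50": 100.0 * nn50 / len(diffs),
        "nn20": nn20,
        "pnn20": 100.0 * nn20 / len(diffs),
    }


features = hrv_features([800, 810, 790, 860, 800])
```

An irregular rhythm such as AF tends to produce larger SDNN, RMSSD and pNN50 values than a regular rhythm, which is why these features are useful inputs to the SVM.&lt;br /&gt;
&lt;br /&gt;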
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, multi-domain, statistical and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, these features were also tested with the SVM algorithm. We developed our own algorithm for selecting and extracting the HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features shown in Table 3.6. The ROC curve for each pre-processing option with these features is shown in Figure 3.7.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.6: Features in multi-domain and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&amp;lt;ref name=LN_F/&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=400px&amp;gt;&lt;br /&gt;
File:SVM HRV AF.png|&amp;#039;&amp;#039;Figure 3.6: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;&lt;br /&gt;
File:SVM TS AF.png|&amp;#039;&amp;#039;Figure 3.7: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
The code first attempts to classify the data without extracting any features, which serves as a baseline for later comparison. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units (meaning each signal is mapped to 100 features), and its output is passed to the fully-connected layer (neural network). Three classes are output: normal ECG, AF, and other abnormality. The training progress is shown in Figure 3.8. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time (over 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction was used to improve these results. By default, the program extracted the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 array. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure 3.9. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The outcomes are presented in the Results section below.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure 3.8: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure 3.9: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning takes a model that was previously trained on a large set of general images and retrains it on our dataset, as demonstrated in Figure 3.10. An advantage of this method is that the model does not need to be built and trained from scratch, which is time-consuming and requires a large dataset. However, the model still needed to be trained and fine-tuned to recognise patterns in our ECG recordings.&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|centre|&amp;#039;&amp;#039;Figure 3.10: Transfer Learning flow chart.&amp;#039;&amp;#039;&amp;lt;ref name=LN_M/&amp;gt;]]&lt;br /&gt;
The ROC curve of the results from this classifier for each pre-processing technique is shown in Figure 3.11. In this project, we modified the [https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html transfer learning example code from MathWorks]&amp;lt;ref name=LN_CNN&amp;gt;The MathWorks, Inc.; &amp;#039;&amp;#039;Classify Time Series Using Wavelet Analysis and Deep Learning&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html&amp;lt;/ref&amp;gt;.&lt;br /&gt;
[[File:SqueezeNet.png|thumb|700px|center|&amp;#039;&amp;#039;Figure 3.11: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table 4.1 and Figures 4.2 and 4.3 below. To compare the results, a single measure that suitably describes performance was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the F1-score for identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy but is far more meaningful.&lt;br /&gt;
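The arithmetic in the example above can be checked with a short Python sketch. It treats the abnormal class as the positive class, so the scenario corresponds to 1 true positive, 0 false positives, 1 false negative and 98 true negatives. The function name classification_scores is hypothetical, and the sketch does not guard against division by zero when a class is never predicted.&lt;br /&gt;

```python
def classification_scores(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# 98 normal and 2 abnormal cases; one abnormal case is missed
accuracy, precision, recall, f1 = classification_scores(tp=1, fp=0, fn=1, tn=98)
print(f"accuracy={accuracy:.3f}, F1={f1:.3f}")  # prints: accuracy=0.990, F1=0.667
```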
&lt;br /&gt;
In each case, the results were displayed as a confusion chart. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted, the stronger the diagonal in the confusion matrix, and the better the performance of the classifier.&lt;br /&gt;
&lt;br /&gt;
Our findings are summarised in Table 4.1 and Figure 4.2 below, using the F1-score of the AF class. These results demonstrate that the CNN and the SVM using 169 features outperformed the other classification methods, especially when wavelet denoising was used. The LSTM also achieved a high score with wavelet denoising; however, it relied on instantaneous frequency and spectral entropy, which are sensitive to noise. MoV removed certain low-frequency components and hence negatively impacted the features, resulting in lower performance for all classifiers. In addition, the 10 time-domain HRV features appear to be the most important features for the SVM, since the SVM using only these performed just slightly worse than the 169-feature SVM. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Figure 4.3 shows the ROC curve for the best result from each classification method. It demonstrates that the multi-feature SVM and the CNN rank very closely, and are notably better than the other classification methods investigated.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 4.1: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HRV || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HRV || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HRV || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8135&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8357&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.7597&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery heights=350px mode=packed&amp;gt;&lt;br /&gt;
File:F1 Scores of Results.png|&amp;#039;&amp;#039;Figure 4.2: Comparison of Results for each Technique.&amp;#039;&amp;#039;&lt;br /&gt;
File:FinalPerformance.png|&amp;#039;&amp;#039;Figure 4.3: Robustness comparison between various classifiers.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
So, can we teach a machine to be a cardiologist? The short answer is yes. Our results have shown that it is entirely possible to teach a machine to accurately recognise different heart conditions by analysing the ECG recordings of patients. It is also worth mentioning that studies in the literature have reported better performance than ours, so with a deeper understanding and more fine-tuning, a highly reliable model could be created.&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by modifying the combination of pre-processing, feature extraction and classification to find the optimal solution, or by finding different methods for each of these processes which are better suited to the data. Our model was designed to identify AF from normal and other abnormal conditions, but the classifier could be extended to identify a greater range of cardiovascular conditions.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17457</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17457"/>
		<updated>2021-10-24T15:19:39Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. In this project, ECG recordings were collected from the PhysioNet Database&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt;, and have been classified using existing ML techniques.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal of interest is the electrocardiogram (ECG), which is collected by placing electrodes on the skin around the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we explored various methods of classifying ECGs, and pre-processing methods to improve the classification.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to:&lt;br /&gt;
* Investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns;&lt;br /&gt;
* Extend this to distinguishing between different heart diseases; and,&lt;br /&gt;
* Find a reasonably good method to do this.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
ECGs represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf&amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between subsequent R-peaks, are of interest, since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles) which push blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so can quickly identify whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may present with dissimilar signs in different patients, and two distinct diseases may have a similar effect on the ECG&amp;lt;ref name=SK_B/&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG as well. This can make it harder for a physician to interpret the ECG; hence, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although for this project just atrial fibrillation (AF) was considered. AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF is characterised by palpitations, shortness of breath and chest pain, and its incidence increases with age.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so the methods and results of a number of previous studies could be examined. This section also briefly discusses the processes and results found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies at 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. Wavelet decomposition is applied to the signal, and a threshold is used to concentrate the signal over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
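The decompose, threshold and reconstruct idea behind wavelet denoising can be illustrated with a very small Python sketch. The choices here are assumptions made purely for illustration and are not the settings used in this project: a single-level Haar wavelet, soft thresholding of the detail coefficients, an even-length input, and a hypothetical function name haar_denoise.&lt;br /&gt;

```python
import math


def haar_denoise(signal, threshold):
    """One-level Haar wavelet denoising with soft thresholding (even-length input)."""
    root2 = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    # Decomposition: approximation (local average) and detail (local difference)
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    # Soft threshold: shrink each detail coefficient towards zero
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    # Reconstruction from the approximation and the shrunk detail coefficients
    denoised = []
    for a, d in zip(approx, shrunk):
        denoised.append((a + d) / root2)
        denoised.append((a - d) / root2)
    return denoised


# Hypothetical samples: a small fluctuation followed by a large, sharp change
noisy = [1.0, 1.2, 5.0, 1.0]
denoised = haar_denoise(noisy, threshold=0.5)
```

In this example the small fluctuation between the first two samples is smoothed out, while the large change between the last two samples is largely preserved.&lt;br /&gt;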
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;M. Dorraki, A. Fouladzadeh, A. Allison, B.R. Davis and D. Abbott; On moment of velocity for signal analysis, in Royal Society Open Science, vol. 6, issue 3, 2019, Available: https://royalsocietypublishing.org/doi/full/10.1098/rsos.182001&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is usually quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is critical to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between them. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and other intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
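The differentiate, square and integrate stages described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not a full Pan-Tompkins implementation: the band-pass filtering and the adaptive thresholding used for peak decisions are omitted, and the function name pan_tompkins_envelope, the window length and the synthetic signal are hypothetical.&lt;br /&gt;

```python
def pan_tompkins_envelope(ecg, window):
    """Simplified Pan-Tompkins stages: differentiate, square, integrate."""
    # Differentiate to emphasise the steep slopes of the QRS complex
    derivative = [b - a for a, b in zip(ecg, ecg[1:])]
    # Square to make all values positive and amplify large slopes
    squared = [d * d for d in derivative]
    # Moving-window integration: average of the most recent squared samples
    integrated = []
    for i in range(len(squared)):
        start = max(0, i - window + 1)
        segment = squared[start:i + 1]
        integrated.append(sum(segment) / len(segment))
    return integrated


# Synthetic signal: flat baseline with one sharp spike standing in for an R-peak
signal = [0.0] * 20 + [1.0] + [0.0] * 20
envelope = pan_tompkins_envelope(signal, window=5)
# Index at which the integrated output first reaches its maximum
peak_index = max(range(len(envelope)), key=envelope.__getitem__)
```

On this synthetic input the integrated output rises to its maximum at the position of the spike, which is the behaviour that a threshold-based R-peak detector would then exploit.&lt;br /&gt;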
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis, wavelet decomposition decomposes the signal into a sum of wavelets&amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum of comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, including classes such as normal and abnormal, and possibly even separating the abnormal class into specific conditions. Classification can be completed using many different methods. In this project, the classification step has made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters of each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set of data. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
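The split, train and validate workflow described above can be sketched as follows. A nearest-centroid classifier on a single feature is used here purely as a stand-in so that the example stays short; it is not one of the classifiers used in this project, and the function names, the feature values and the 25% test fraction are all hypothetical.&lt;br /&gt;

```python
import random


def split_dataset(samples, labels, test_fraction, seed):
    """Shuffle the data and split it into a training set and a test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


def train_centroids(train):
    """Training step: store the mean feature value of each class."""
    sums, counts = {}, {}
    for value, label in train:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Assign the class whose centroid is closest to the feature value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Hypothetical single-feature data: low values are "normal", high values are "AF"
samples = [0.1, 0.2, 0.15, 0.3, 0.9, 1.1, 1.0, 0.8]
labels = ["normal"] * 4 + ["AF"] * 4
train, test = split_dataset(samples, labels, test_fraction=0.25, seed=0)
centroids = train_centroids(train)
# Validation: compare the assigned classes with the true classes of the test set
correct = sum(1 for value, label in test if predict(centroids, value) == label)
test_accuracy = correct / len(test)
```

Keeping the test set separate from the training data is what allows the validation step to estimate how the classifier would behave on recordings it has never seen.&lt;br /&gt;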
&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the region in which it falls. Figure 2.10 shows a simple 2-dimensional example with Class 1 in red and Class 2 in blue. If a new data point, as shown by the green dot in Figure 2.10, is introduced, the SVM will classify it as Class 2, given the side on which it falls.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational power required for classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
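&lt;br /&gt;
The convolution and pooling stages described above can be illustrated with a minimal one-dimensional Python sketch. This is not code from the project (which used MATLAB); the function names, the example signal and the kernel are all illustrative assumptions, and a real CNN learns its kernels during training rather than using a fixed one.&lt;br /&gt;

```python
def conv1d(signal, kernel):
    """Valid-mode 1-D sliding dot product, the core operation of a convolution layer."""
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]


def max_pool1d(features, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(features[i:i + size])
        for i in range(0, len(features) - size + 1, size)
    ]


# Hypothetical toy signal with two sharp peaks (loosely resembling R peaks).
signal = [0, 0, 1, 5, 1, 0, 0, 0, 2, 6, 2, 0]
# Hand-picked kernel that responds strongly to narrow peaks (an assumption).
kernel = [-1, 2, -1]

features = conv1d(signal, kernel)   # 10 values, one per kernel position
pooled = max_pool1d(features, 2)    # 5 values after pooling
print(features)
print(pooled)
```

Pooling halves the number of feature values while keeping the strong responses at the two peaks, which is the size reduction referred to above.&lt;br /&gt;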
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance for the new task.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between five classes of ECG, with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; developed a novel approach using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. Figure 3.1 shows the flow chart used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. There is a loop from signal filtering to classification assessment, since various machine learning techniques were investigated, as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.1: ECG classification methodology.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG and MathWorks Example ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.2 shows an example of a normal, healthy ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the expected locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.4 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.4: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;MATLAB ECG Wavelet Classification&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
An example from MathWorks demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier using MATLAB&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html&amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into a training set and a test set. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. This was a suitable starting point from which to compare later results.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigated the effects on ECG classification of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity (MoV) were applied to the same dataset, then the raw dataset and these cleaned versions were fed into the classifiers to measure the importance of the pre-processing stage. &lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models. The SVM was chosen due to its relative simplicity, the CNN was selected as it is effective at analysing images such as spectrograms, and the LSTM network was chosen as it is simpler than other neural networks like the CNN, but still shares some of its advantages.&lt;br /&gt;
&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF can be distinguished from other heart rhythms by analysing the beat-to-beat intervals of an ECG recording. With that aim, we performed feature extraction to find information about heart rate variability (HRV), before using the SVM to recognise the pattern of AF signals. Figure 3.6 shows the receiver operating characteristic (ROC) curves of the SVM when run for each of the 3 pre-processing options, using HRV feature extraction. The closer the ROC curve hugs the top left corner, the better the classification. Hence wavelet denoising was the most effective pre-processing technique in this case.&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.5: HRV features&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Feature !! Meaning !! Unit&lt;br /&gt;
|-&lt;br /&gt;
| Heart rate || number of heart beats per minute || bpm&lt;br /&gt;
|-&lt;br /&gt;
| Mean interval || the mean value of beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| SDNN || standard deviation of beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| SDSD || standard deviation of the differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| RMSSD || root mean square of the differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| NN50 || the number of successive interval differences greater than 50 ms || du&lt;br /&gt;
|-&lt;br /&gt;
| pNN50 || the percentage of successive interval differences greater than 50 ms || %&lt;br /&gt;
|-&lt;br /&gt;
| NN20 || the number of successive interval differences greater than 20 ms || du&lt;br /&gt;
|-&lt;br /&gt;
| pNN20 || the percentage of successive interval differences greater than 20 ms || %&lt;br /&gt;
|-&lt;br /&gt;
| ShE || Shannon entropy of heart beats || du&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 10&lt;br /&gt;
|}&lt;br /&gt;
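As a rough illustration of the time-domain entries in Table 3.5, the following Python sketch computes them from a list of beat-to-beat (RR) intervals. It is not the algorithm developed in this project; the function name, the dictionary keys, the use of the population standard deviation and the two example RR series are assumptions made for the sketch, and the Shannon entropy feature is omitted.&lt;br /&gt;

```python
import math
from statistics import mean, pstdev


def hrv_features(rr_ms):
    """Basic time-domain HRV features from RR intervals given in milliseconds."""
    # Differences between successive beat-to-beat intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    nn20 = sum(1 for d in diffs if abs(d) > 20)
    return {
        "heart_rate_bpm": 60000.0 / mean(rr_ms),   # mean rate over the record
        "mean_interval_ms": mean(rr_ms),
        "sdnn_ms": pstdev(rr_ms),                  # population std (assumption)
        "sdsd_ms": pstdev(diffs),
        "rmssd_ms": math.sqrt(mean([d * d for d in diffs])),
        "nn50": nn50,
        "pnn50_percent": 100.0 * nn50 / len(diffs),
        "nn20": nn20,
        "pnn20_percent": 100.0 * nn20 / len(diffs),
    }


# Hypothetical RR series: a steady rhythm and a highly irregular one.
regular = hrv_features([800, 810, 790, 805, 795])
irregular = hrv_features([620, 940, 710, 1100, 560])
print(regular["rmssd_ms"], irregular["rmssd_ms"])
```

The irregular series gives much larger SDNN, RMSSD and pNN50 values than the steady one, which is the kind of separation the SVM relies on when detecting AF from HRV features.&lt;br /&gt;
&lt;br /&gt;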
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, multi-domain, statistical and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, these features were also tested with the SVM algorithm. We developed our own algorithm for selecting and extracting the HRV features in Table 3.5, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features summarised in Table 3.6. The ROC curve for each pre-processing option with these features is shown in Figure 3.7.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.6: Features in multi-domain and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&amp;lt;ref name=LN_F/&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=400px&amp;gt;&lt;br /&gt;
File:SVM HRV AF.png|&amp;#039;&amp;#039;Figure 3.6: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;&lt;br /&gt;
File:SVM TS AF.png|&amp;#039;&amp;#039;Figure 3.7: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
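The ROC curves in Figures 3.6 and 3.7 are each summarised by the area under the curve (AUC). As a hedged Python illustration (not the MATLAB routine used to produce the figures), the AUC can be computed as the fraction of positive/negative pairs that the classifier ranks correctly. The function name and the example labels and scores are hypothetical.&lt;br /&gt;

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case scores above a negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # pair ranked correctly
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical AF scores: label 1 = AF, label 0 = not AF.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 1 corresponds to a curve that passes through the top left corner, while 0.5 corresponds to random guessing.&lt;br /&gt;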
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
The code first attempted to classify the data without extracting any features, which serves as a baseline for later comparison. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units (meaning each signal is mapped to 100 features), and its output is passed to the fully-connected layer (neural network). Three classes are output: normal ECG, AF, and other abnormality. The training progress is shown in Figure 3.8. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time (over 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction was used to improve these results. By default, the program extracted the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 array. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure 3.9. Attempts were made to alter the features extracted; however, this led either to errors or to extremely poor results, and so is not shown here.&lt;br /&gt;
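&lt;br /&gt;
To make the spectral entropy feature more concrete, the Python sketch below computes a normalised Shannon entropy of the power spectrum of a whole signal using a naive discrete Fourier transform. It is an illustration only: the MathWorks example presumably evaluates the entropy over short time windows rather than over the whole record, and the function name, normalisation and test signals here are assumptions.&lt;br /&gt;

```python
import cmath
import math


def spectral_entropy(signal):
    """Normalised Shannon entropy of the one-sided power spectrum.

    Values near 0 indicate a spiky spectrum; values near 1 indicate a flat one.
    """
    n = len(signal)
    power = []
    for k in range(1, n // 2):  # skip the DC bin
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        power.append(abs(coeff) ** 2)
    total = sum(power)
    probs = [p / total for p in power if p > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy / math.log2(len(power))


# Hypothetical test signals: a pure tone and a single impulse.
n = 128
sine = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
impulse = [1.0] + [0.0] * (n - 1)
print(spectral_entropy(sine), spectral_entropy(impulse))
```

The pure tone concentrates its power in one frequency bin and so has an entropy close to 0, whereas the impulse spreads its power evenly over all bins and has an entropy close to 1.&lt;br /&gt;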
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section below.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure 3.8: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure 3.9: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning involves taking a model that was previously trained on a large set of general images and training it further on our dataset, as demonstrated in Figure 3.10. An advantage of this method is that the model does not need to be built and trained from scratch, which is time-consuming and requires a large dataset. However, the model still needed to be trained and fine-tuned to recognise patterns in our ECG recordings.&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|centre|&amp;#039;&amp;#039;Figure 3.10: Transfer Learning flow chart.&amp;#039;&amp;#039;&amp;lt;ref name=LN_M/&amp;gt;]]&lt;br /&gt;
In this project, we modified the transfer learning example code from MathWorks&amp;lt;ref name=LN_CNN&amp;gt;The MathWorks, Inc.; &amp;#039;&amp;#039;Classify Time Series Using Wavelet Analysis and Deep Learning&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html&amp;lt;/ref&amp;gt;. The ROC curve of the results from this classifier for each pre-processing technique is shown in Figure 3.11.&lt;br /&gt;
[[File:SqueezeNet.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.11: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table 4.1 and Figures 4.2 and 4.3 below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set may contain 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score conveys the balance between the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the F1-score for identifying the abnormal cases would be 66.7%, which is significantly lower than the accuracy but gives far more meaning to the results.&lt;br /&gt;
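&lt;br /&gt;
The arithmetic of the 98/2 example above can be checked with a few lines of Python. This is only a worked illustration; the function and variable names are hypothetical and do not come from the project code.&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1-score from counts of the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 98 normal and 2 abnormal cases; one abnormal case is detected, one is missed,
# and no normal case is flagged as abnormal.
tp, fp, fn, tn = 1, 0, 1, 98
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} F1={f1:.3f}")
# prints: accuracy=0.990 precision=1.000 recall=0.500 F1=0.667
```

The precision is perfect because nothing was wrongly flagged, but the recall of 0.5 pulls the F1-score down to about 0.667, exposing the missed case that the 99% accuracy hides.&lt;br /&gt;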
&lt;br /&gt;
In each case, the results were displayed as a confusion chart. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted, the stronger the diagonal in the confusion matrix, and the better the performance of the classifier.&lt;br /&gt;
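&lt;br /&gt;
A confusion chart of this kind is simply a table of counts indexed by true class and predicted class. The Python sketch below builds one for three classes; the class labels and the example predictions are invented for illustration and are not results from this project.&lt;br /&gt;

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


# Hypothetical labels: N = normal, A = atrial fibrillation, O = other.
classes = ["N", "A", "O"]
true_labels = ["N", "N", "N", "A", "A", "O"]
predicted = ["N", "N", "A", "A", "O", "O"]

matrix = confusion_matrix(true_labels, predicted, classes)
for name, row in zip(classes, matrix):
    print(name, row)
```

Correct predictions accumulate on the diagonal, so off-diagonal entries show directly which classes are being confused with which.&lt;br /&gt;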
&lt;br /&gt;
Our findings are summarised in Table 4.1 and Figure 4.2 below, using the F1-score of the AF class. These results demonstrate that the CNN and the SVM using 169 features outperformed the other classification methods, especially when wavelet denoising was used. The LSTM also achieved a high score with wavelet denoising; however, it relied on instantaneous frequency and spectral entropy, which are sensitive to noise. In addition, MoV removed certain low-frequency components, and hence negatively impacted the features, resulting in low performance in all classifiers. The 10 time-domain HRV features appear to be the most important features for the SVM, since the HRV-based SVM performed only slightly worse than the 169-feature SVM. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Figure 4.3 shows the ROC curve for the best result from each classification method. It demonstrates that the multi-feature SVM and the CNN rank very closely, and are notably better than the other classification methods investigated.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 4.1: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HRV || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HRV || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HRV || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8135&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8357&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.7597&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery heights=350px mode=packed&amp;gt;&lt;br /&gt;
File:F1 Scores of Results.png|&amp;#039;&amp;#039;Figure 4.2: Comparison of Results for each Technique.&amp;#039;&amp;#039;&lt;br /&gt;
File:FinalPerformance.png|&amp;#039;&amp;#039;Figure 4.3: Robustness comparison between various classifiers.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
So, can we teach a machine to be a cardiologist? The short answer is yes. In terms of teaching a machine to accurately recognise different heart conditions by analysing the ECG recordings of patients, this is entirely possible, as our results have shown. It is also worth mentioning that studies in the literature have achieved better results than ours, so with a deeper understanding and more fine-tuning, a highly reliable model can be created.&lt;br /&gt;
&lt;br /&gt;
Future work could improve classification performance. This could be done by modifying the combination of pre-processing, feature extraction and classification to find the optimal solution, or by finding different methods for each of these processes which are better suited to the data. Our model was designed to distinguish AF from normal and other abnormal conditions, but the classifier could be extended to identify a greater range of cardiovascular conditions.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17453</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17453"/>
		<updated>2021-10-24T15:13:23Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. In this project, ECG recordings were collected from the PhysioNet Database&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt;, and have been classified using existing ML techniques.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment, or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal which is of interest is the electrocardiogram (ECG), which is collected by placing electrodes on the skin around the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so the early detection and treatment of CVD is critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we explored various methods of classifying ECGs, and pre-processing methods to improve this.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to:&lt;br /&gt;
* Investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns;&lt;br /&gt;
* Extend this to distinguishing between different heart diseases; and,&lt;br /&gt;
* Find a reasonably good method to do this.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
ECGs represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf&amp;lt;/ref&amp;gt;. Consequently, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular the P wave, T wave and QRS complex, as well as the time between subsequent R peaks, are of interest, since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles) which push blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs on different patients, and two distinct diseases may have a similar effect on the ECG&amp;lt;ref name=SK_B/&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG as well. This can make the ECG harder for a physician to interpret; hence, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors are able to be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although for this project only atrial fibrillation (AF) was considered. AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF is characterised by palpitations, shortness of breath and chest pain, and its incidence increases with age.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so the methods and results of a number of previous studies could be examined. This section also briefly discusses the processes and results found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be done with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies at 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. Wavelet decomposition is applied to the signal, which concentrates the signal into only a few large wavelet coefficients, and a threshold is then used to suppress the remaining small coefficients, which mostly represent noise&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering because particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
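&lt;br /&gt;
The thresholding idea can be illustrated with a minimal Python sketch. It assumes a single-level Haar wavelet and soft thresholding, which are chosen here only for simplicity and are not necessarily the wavelet or threshold rule used in this project; all function names and the sample values are illustrative, and the input is assumed to have an even number of samples.&lt;br /&gt;
&lt;br /&gt;
```python
import math

def haar_step(signal):
    """One level of the Haar wavelet transform: returns (approximation, detail)."""
    scale = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail

def inverse_haar_step(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    scale = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * scale)
        out.append((a - d) * scale)
    return out

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient towards zero; those within the threshold become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def denoise(signal, threshold):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_step(signal)
    return inverse_haar_step(approx, soft_threshold(detail, threshold))

# Illustrative values only: small sample-to-sample jitter around a larger bump.
noisy = [1.0, 1.1, 0.9, 1.0, 5.0, 5.2, 1.0, 0.9]
clean = denoise(noisy, threshold=0.2)
```
&lt;br /&gt;
With a threshold of zero the signal is reconstructed unchanged, while a larger threshold removes progressively more of the fine detail. Practical ECG denoising would typically use several decomposition levels.&lt;br /&gt;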
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;M. Dorraki, A. Fouladzadeh, A. Allison, B.R. Davis and D. Abbott; On moment of velocity for signal analysis, in Royal Society Open Science, vol. 6, issue 3, 2019, Available: https://royalsocietypublishing.org/doi/full/10.1098/rsos.182001&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency; however, it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is usually quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is critical to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between them. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and other intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
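&lt;br /&gt;
The later stages of this process can be sketched in Python as follows. This is a simplified illustration rather than the published algorithm: the initial band-pass filter is omitted, a two-point difference stands in for the derivative filter, and the window width and toy signal are arbitrary assumptions.&lt;br /&gt;
&lt;br /&gt;
```python
def derivative(signal):
    """First difference: emphasises the steep slopes of the QRS complex."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]

def square(signal):
    """Point-wise squaring: makes all values positive and amplifies large slopes."""
    return [x * x for x in signal]

def moving_window_integrate(signal, width):
    """Average over the current and previous samples in the window.

    Samples missing at the start of the signal are treated as zero.
    """
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1): i + 1]
        out.append(sum(window) / width)
    return out

def qrs_envelope(ecg, width=3):
    """Chain the three stages to obtain a smooth pulse around each QRS complex."""
    return moving_window_integrate(square(derivative(ecg)), width)

# Toy signal: a nearly flat baseline with one sharp spike standing in for an R peak.
ecg = [0.0, 0.0, 0.1, 0.0, 1.0, 0.0, 0.0, 0.1, 0.0]
envelope = qrs_envelope(ecg)
peak_index = envelope.index(max(envelope))
```
&lt;br /&gt;
The largest value of the envelope falls at the position of the spike, which is how the R-peak locations (and hence RR intervals) can be recovered.&lt;br /&gt;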
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis, wavelet decomposition decomposes the signal into a sum of wavelets&amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, including classes such as normal and abnormal, and possibly even separating the abnormal class into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights in a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or a hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the side of the boundary on which it falls. Figure 2.10 shows a simple 2-dimensional example with Class 1 in red and Class 2 in blue. If a new data point, as shown by the green dot in Figure 2.10, is introduced, the SVM will classify it as Class 2, given the side it falls on.&lt;br /&gt;
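&lt;br /&gt;
The decision rule of a trained linear SVM can be sketched in Python as below. The weights, bias and test points are hypothetical values chosen for illustration and are not taken from Figure 2.10; finding the separating line during training is not shown.&lt;br /&gt;
&lt;br /&gt;
```python
def decision_value(weights, bias, point):
    """Signed score of a point relative to the separating line or hyperplane."""
    return sum(w * x for w, x in zip(weights, point)) + bias

def classify(weights, bias, point):
    """Assign a class according to the side of the boundary the point falls on."""
    return "Class 2" if decision_value(weights, bias, point) >= 0 else "Class 1"

# Hypothetical boundary x1 + x2 = 5 in a two-feature space.
weights, bias = [1.0, 1.0], -5.0
new_point = (4.0, 3.0)
label = classify(weights, bias, new_point)
```
&lt;br /&gt;
Training an SVM amounts to choosing the weights and bias so that the margin between the boundary and the nearest training points of each class is as large as possible.&lt;br /&gt;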
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer is used to classify the data, and this is usually a regular ANN. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol. 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance for the new task.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. They are an improvement over traditional RNNs which suffer from short-term memory, and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between five classes of ECG with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; developed a novel approach using a bidirectional LSTM network and wavelet sequences to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In this project, we investigated the effect of a range of pre-processing techniques and classification algorithms on classifying the same set of data. Figure 3.1 shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. There is a loop from signal filtering to classification assessment, since several machine learning techniques were investigated, along with the most appropriate denoising method for AF detection.&lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.1: ECG classification methodology.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG and MathWorks Example ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.2 shows an example of a normal, healthy ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the expected locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.4 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.4: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;MATLAB ECG Wavelet Classification&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An example from MathWorks demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier using MATLAB&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html&amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into a training set and a test set. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. This was a suitable starting point from which to compare later results.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigated the effect on ECG classification of two other filtering methods discussed in the literature. Wavelet denoising and the Moment of Velocity were applied to the same dataset; the raw dataset and the two processed versions were then fed into the classifiers to measure the importance of the pre-processing step.&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models. The SVM was chosen due to its relative simplicity, the CNN was selected as it is effective at analysing images such as spectrograms, and the LSTM network was chosen as it is simpler than other neural networks like the CNN, but still shares some of its advantages.&lt;br /&gt;
&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF can be distinguished from other heart rhythms by analysing the beat-to-beat intervals of an ECG recording. With that aim, we performed feature extraction to obtain information about heart rate variability (HRV), listed in Table 3.5, before using the SVM to recognise the pattern of AF signals. Figure 3.6 shows the receiver operating characteristic (ROC) curves of the SVM for each of the three pre-processing options, using HRV feature extraction. The closer the ROC curve hugs the top left corner, the better the classification. Hence, wavelet denoising was the most effective pre-processing technique in this case.&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.5: HRV features&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! No. !! Feature !! Meaning !! Unit&lt;br /&gt;
|-&lt;br /&gt;
| 1 || Heart rate || number of heart beats per minute || bpm&lt;br /&gt;
|-&lt;br /&gt;
| 2 || Mean interval || mean value of the beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| 3 || SDNN || standard deviation of the beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| 4 || SDSD || standard deviation of the differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| 5 || RMSSD || root mean square of the differences between successive beat-to-beat intervals || ms&lt;br /&gt;
|-&lt;br /&gt;
| 6 || pNN50 || percentage of successive intervals that differ by more than 50 ms || %&lt;br /&gt;
|-&lt;br /&gt;
| 7 || normRMSSD || RMSSD normalised to the mean interval || unitless&lt;br /&gt;
|-&lt;br /&gt;
| 8 || normSDSD || SDSD normalised to the mean interval || unitless&lt;br /&gt;
|-&lt;br /&gt;
| 9 || ShE || Shannon entropy of the heart beat || du&lt;br /&gt;
|-&lt;br /&gt;
| 10 || RRVN || variance of the R-R intervals divided by the squared mean interval || unitless&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 10 features || &lt;br /&gt;
|}&lt;br /&gt;
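Most of the features in Table 3.5 follow directly from a list of RR intervals, as the Python sketch below illustrates. It is not the feature-extraction code used in this project: the function name and sample intervals are hypothetical, sample (n-1) standard deviations are assumed, and the Shannon entropy is left out because it depends on a choice of histogram bins.&lt;br /&gt;
&lt;br /&gt;
```python
import statistics

def hrv_features(rr_ms):
    """HRV features from a list of beat-to-beat (RR) intervals in milliseconds."""
    # Differences between successive intervals drive SDSD, RMSSD and pNN50.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    mean_rr = statistics.mean(rr_ms)
    rmssd = (sum(d * d for d in diffs) / len(diffs)) ** 0.5
    sdsd = statistics.stdev(diffs)
    return {
        "heart_rate_bpm": 60000 / mean_rr,
        "mean_interval_ms": mean_rr,
        "sdnn_ms": statistics.stdev(rr_ms),
        "sdsd_ms": sdsd,
        "rmssd_ms": rmssd,
        "pnn50_percent": 100 * sum(1 for d in diffs if abs(d) > 50) / len(diffs),
        "norm_rmssd": rmssd / mean_rr,
        "norm_sdsd": sdsd / mean_rr,
        "rrvn": statistics.variance(rr_ms) / mean_rr ** 2,
    }

# Illustrative intervals only: a steady rhythm and a highly variable one.
regular = hrv_features([800, 810, 790, 800, 805])
irregular = hrv_features([600, 950, 700, 1100, 650])
```
&lt;br /&gt;
The variable rhythm gives much larger RMSSD and pNN50 values than the steady one, which is the kind of separation between classes that the SVM relies on.&lt;br /&gt;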
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, multi-domain, statistical and morphological heartbeat features worked well with a Decision Tree (DT) classifier for AF detection. Hence, these features were also tested with the SVM algorithm. We developed our own algorithm to select and extract the HRV features in Table 3.5, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features summarised in Table 3.6. The ROC curve for each pre-processing option with these features is shown in Figure 3.7.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.6: Features in multi-domain and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&amp;lt;ref name=LN_F/&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=400px&amp;gt;&lt;br /&gt;
File:SVM HRV AF.png|&amp;#039;&amp;#039;Figure 3.6: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;&lt;br /&gt;
File:SVM TS AF.png|&amp;#039;&amp;#039;Figure 3.7: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
The code first attempts to classify the data without extracting any features, which serves as a baseline for later comparison. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units (meaning each signal is mapped to 100 features), and its output is passed to the fully-connected layer (neural network). Three classes are output: normal ECG, AF, and other abnormality. The training progress is shown in Figure 3.8. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time (over 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction was used to improve these results. By default, the program extracted the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, each 3000-sample signal is reduced to a 2-by-63 array. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure 3.9. Attempts were made to alter the features extracted; however, these either led to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
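&lt;br /&gt;
To show what spectral entropy measures, the Python sketch below computes a single normalised value for a whole signal. This is a simplified stand-in under stated assumptions: the MathWorks example computes time-dependent values over short windows, whereas here a plain discrete Fourier transform of the full signal is used, the function names and test signals are illustrative, and the signal is assumed not to be all zeros.&lt;br /&gt;
&lt;br /&gt;
```python
import cmath
import math

def power_spectrum(signal):
    """Power of each non-negative frequency bin of the discrete Fourier transform."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def spectral_entropy(signal):
    """Shannon entropy of the normalised power spectrum, scaled to lie in [0, 1]."""
    power = power_spectrum(signal)
    total = sum(power)
    probs = [p / total for p in power if p > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy / math.log2(len(power))

# A pure tone has a spiky spectrum; a single impulse has a flat one.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```
&lt;br /&gt;
The tone gives an entropy close to 0 because its power sits in one frequency bin, while the impulse gives a value close to 1 because its power is spread evenly across all bins.&lt;br /&gt;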
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section below.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure 3.8: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure 3.9: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning takes a model that was previously trained on a large set of general images and trains it further on our dataset, as demonstrated in Figure 3.10. An advantage of this method is that the model does not need to be built and trained from scratch, which is time-consuming and requires a large dataset. However, the model still needed to be trained and fine-tuned to recognise patterns in our ECG recordings.&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|centre|&amp;#039;&amp;#039;Figure 3.10: Transfer Learning flow chart.&amp;#039;&amp;#039;&amp;lt;ref name=LN_M/&amp;gt;]]&lt;br /&gt;
The ROC curve of the results from this classifier for each pre-processing technique is shown in Figure 3.11. In this project, we modified the MathWorks transfer learning example code, available [https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html here]&amp;lt;ref name=LN_CNN&amp;gt;The MathWorks, Inc.; &amp;#039;&amp;#039;Classify Time Series Using Wavelet Analysis and Deep Learning&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html&amp;lt;/ref&amp;gt;.&lt;br /&gt;
[[File:SqueezeNet.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.11: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table 4.1 and Figures 4.2 and 4.3 below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the precision for the abnormal class is 100% and the recall is 50%, so the F1-score is 66.7%, which is significantly lower than the accuracy but gives far more meaning to the results.&lt;br /&gt;
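&lt;br /&gt;
The worked example above can be checked numerically. The following is a minimal sketch that treats the abnormal class as the positive class; the function and variable names are chosen for illustration only and do not come from the project code.&lt;br /&gt;

```python
# Worked example from the text: 98 normal and 2 abnormal recordings,
# with one abnormal recording wrongly classified as normal.
def f1_score(tp, fp, fn):
    """Return (precision, recall, F1) from raw counts.

    Assumes at least one true positive, so no division by zero occurs.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

tp, fn = 1, 1      # abnormal cases: one found, one missed
tn, fp = 98, 0     # normal cases: all correctly classified

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision, recall, f1 = f1_score(tp, fp, fn)

print(f"accuracy  = {accuracy:.3f}")
print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
print(f"F1-score  = {f1:.3f}")
```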
&lt;br /&gt;
In each case, the results were displayed as a confusion chart. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted, the stronger the diagonal in the confusion matrix, and the better the performance of the classifier.&lt;br /&gt;
&lt;br /&gt;
Our findings are summarised in Table 4.1 and Figure 4.2 below, using the F1-score of the AF class. These results demonstrate that the CNN and the SVM using 169 features outperformed the other classification methods, especially when wavelet denoising was used. The LSTM also achieved a high score with wavelet denoising; however, it relied on instantaneous frequency and spectral entropy, which are sensitive to noise. The MoV removed certain low-frequency components and hence negatively impacted the features, resulting in low performance in all classifiers. In addition, the 10 time-domain HVR features appear to be the most important features for the SVM, since the SVM using only these features scored only slightly lower than the 169-feature SVM. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Figure 4.3 shows the ROC curve for the best result from each classification method. It demonstrates that the multi-feature SVM and the CNN rank very closely, and are notably better than the other classification methods investigated.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 4.1: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HVR || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HVR || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HVR || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8135&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8357&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.7597&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery heights=350px mode=packed&amp;gt;&lt;br /&gt;
File:F1 Scores of Results.png|&amp;#039;&amp;#039;Figure 4.2: Comparison of Results for each Technique.&amp;#039;&amp;#039;&lt;br /&gt;
File:FinalPerformance.png|&amp;#039;&amp;#039;Figure 4.3: Robustness comparison between various classifiers.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
So, can we teach a machine to be a cardiologist? The short answer is yes. As our results have shown, it is entirely possible to teach a machine to accurately recognise different heart conditions by analysing the ECG recordings of patients. It is also worth mentioning that studies in the literature have reported higher scores than ours, so with a deeper understanding and more fine-tuning, a highly reliable model could be created.&lt;br /&gt;
&lt;br /&gt;
Future work could improve classification performance, either by modifying the combination of pre-processing, feature extraction and classification to find the optimal solution, or by finding different methods for each of these processes which are better suited to the data. Our model was designed to distinguish AF from normal and other abnormal conditions, but the classifier could be extended to identify a greater range of cardiovascular conditions.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17407</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17407"/>
		<updated>2021-10-24T14:33:03Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. In this project, ECG recordings were collected from the PhysioNet Database&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt;, and have been classified using existing ML techniques.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal which is of interest is the electrocardiogram (ECG), which is collected by placing electrodes on the skin around the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we explored various methods of classifying ECGs, and pre-processing methods to improve the classification.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to:&lt;br /&gt;
* Investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns;&lt;br /&gt;
* Extend this to distinguishing between different heart diseases; and,&lt;br /&gt;
* Find a reasonably good method to do this.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
ECGs represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf&amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between subsequent R peaks, are of interest, since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles) which push blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may show dissimilar signs in different patients, and two distinct diseases may have a similar effect on the ECG&amp;lt;ref name=SK_B/&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG as well. This can make it harder for a physician to distinguish between conditions; hence, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be reduced through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although for this project only atrial fibrillation (AF) was considered. AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF becomes more common with age, and is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so the methods and results of a number of previous studies could be examined. This section also briefly discusses the processes and results found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies at 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner: wavelet decomposition is applied to the signal, and a threshold is used to concentrate the signal over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;M. Dorraki, A. Fouladzadeh, A. Allison, B.R. Davis and D. Abbott; On moment of velocity for signal analysis, in Royal Society Open Science, vol. 6, issue 3, 2019, Available: https://royalsocietypublishing.org/doi/full/10.1098/rsos.182001&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency; however, it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is usually quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is critical to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between them. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and other intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
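&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the band-pass filtering stage is omitted, a plain first difference stands in for the derivative filter, and the toy signal and window width are hypothetical.&lt;br /&gt;

```python
def differentiate(signal):
    """First difference: emphasises the steep slopes of the QRS complex."""
    return [b - a for a, b in zip(signal, signal[1:])]

def square(signal):
    """Point-wise squaring: makes all samples positive and boosts large slopes."""
    return [x * x for x in signal]

def moving_window_integrate(signal, width):
    """Average over a sliding window of `width` samples (zero-padded start)."""
    padded = [0.0] * (width - 1) + list(signal)
    return [sum(padded[i:i + width]) / width for i in range(len(signal))]

# Toy signal: flat baseline with two sharp "R-peaks" (hypothetical data).
ecg = [0.0] * 40
ecg[10] = 1.0
ecg[30] = 1.0

energy = moving_window_integrate(square(differentiate(ecg)), width=5)

# Index of the first sample at which the integrated energy is largest.
peak_index = max(range(len(energy)), key=energy.__getitem__)
print(peak_index)
```

In a full implementation, a threshold would be applied to the integrated signal to locate every R-peak rather than only the largest one.&lt;br /&gt;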
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis, wavelet decomposition decomposes the signal into a sum of wavelets&amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, including classes such as normal and abnormal, and possibly even separating the abnormal class into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters of each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set of data. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
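&lt;br /&gt;
The split and validation steps can be sketched as below. This is a minimal illustration only: the function names, the 80/20 split, the fixed random seed and the placeholder recordings are assumptions for this example, not the settings used in the project.&lt;br /&gt;

```python
import random

def train_test_split(samples, labels, test_fraction=0.2, seed=0):
    """Shuffle the indices, then hold out a fraction of the data for testing."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test

def accuracy(predicted, actual):
    """Fraction of test items whose assigned class matches the true class."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical placeholder data: 10 recordings with known labels.
samples = [f"recording_{i}" for i in range(10)]
labels = ["AF" if i % 5 == 0 else "Normal" for i in range(10)]

train, test = train_test_split(samples, labels)
print(len(train), len(test))
```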
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or a hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the side of the boundary on which it falls. Figure 2.10 shows a simple 2-dimensional example with Class 1 in red and Class 2 in blue. If a new data point, as shown by the green dot in Figure 2.10, is introduced, the SVM will classify this as Class 2, given the side it falls on.&lt;br /&gt;
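&lt;br /&gt;
Once the boundary is known, assigning a class to a new point reduces to checking which side of it the point lies on. The sketch below shows only this decision step for a linear boundary; the weights and bias are hypothetical values picked by hand, whereas an SVM would learn them from the training set.&lt;br /&gt;

```python
def classify(point, weights, bias):
    """Assign a class from the side of the hyperplane w.x + b = 0."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return "Class 2" if score >= 0 else "Class 1"

# Hypothetical boundary x1 + x2 = 5; an SVM would learn these values in training.
weights, bias = [1.0, 1.0], -5.0

print(classify([1.0, 2.0], weights, bias))
print(classify([4.0, 3.0], weights, bias))
```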
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer is used to classify the data, and this is usually a regular ANN. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
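&lt;br /&gt;
The convolution and pooling stages can be illustrated in one dimension as follows. This is a toy sketch under stated assumptions: real CNNs for images operate in two dimensions and learn their kernel weights during training, whereas the kernel and signal here are hypothetical and hand-picked.&lt;br /&gt;

```python
def convolve1d(signal, kernel):
    """Valid-mode sliding dot product (cross-correlation, as used in CNN layers)."""
    width = len(kernel)
    return [sum(s * k for s, k in zip(signal[i:i + width], kernel))
            for i in range(len(signal) - width + 1)]

def max_pool1d(features, size):
    """Keep the largest value in each non-overlapping block of `size` values."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]

signal = [0, 0, 1, 3, 1, 0, 0, 2, 4, 2, 0, 0]
edge_kernel = [-1, 0, 1]   # hypothetical hand-picked kernel; a CNN learns these weights

features = convolve1d(signal, edge_kernel)   # responds to rising and falling edges
pooled = max_pool1d(features, size=2)        # halves the number of values kept

print(features)
print(pooled)
```

The pooled output is half the length of the feature map, which is how pooling reduces the amount of data passed to later layers.&lt;br /&gt;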
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance for the new task.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. They are an improvement over traditional RNNs which suffer from short-term memory, and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to classify between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; developed a novel approach using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In this project, we investigated the effect of a range of pre-processing techniques and classification algorithms when applied to the same set of data. Figure 3.1 shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The workflow loops from signal filtering back to classification assessment, since several machine learning techniques were investigated and the most appropriate denoising method for AF detection also had to be identified.&lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.1: ECG classification methodology.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG and MathWorks Example ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.2 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the expected locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.4 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.4: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;MATLAB ECG Wavelet Classification&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An example from MathWorks demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier using MATLAB&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html&amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into a training set and a test set. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. This was a suitable starting point from which to compare later results.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier transform (FT) based filters to denoise signals, we investigated the effect on ECG classification of two other pre-processing methods discussed in the literature. Wavelet denoising and the Moment of Velocity (MoV) were applied to the same dataset, and the raw dataset and both processed versions were then fed into the classifiers to measure the importance of the pre-processing stage.&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models. The SVM was chosen due to its relative simplicity, the CNN was selected as it is effective at analysing images such as spectrograms, and the LSTM network was chosen as it is simpler than other neural networks like the CNN, but still shares some of its advantages.&lt;br /&gt;
&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF can be distinguished from other heart rhythms by analysing the beat-to-beat intervals of an ECG recording. With that aim, we performed feature extraction to obtain information about heart rate variability (HRV), before using the SVM to recognise the pattern of AF signals. Figure 3.6 shows the receiver operating characteristic (ROC) curves of the SVM for each of the three pre-processing options, using HRV feature extraction. The closer the ROC curve is to the top-left corner, the better the classification. Hence, wavelet denoising was the most effective pre-processing technique in this case.&lt;br /&gt;
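&lt;br /&gt;
To make the ROC comparison concrete, the sketch below builds an ROC curve by sweeping the decision threshold over classifier scores, and summarises it with the area under the curve (AUC). The labels and scores are hypothetical values made up for illustration, not outputs of the SVM used in this project, and the function names are assumptions of this example.&lt;br /&gt;

```python
def roc_curve(labels, scores):
    """Return (fpr, tpr) lists obtained by sweeping the decision threshold.

    labels: 1 for the positive class (e.g. AF), 0 otherwise.
    scores: classifier outputs, where larger means more likely positive.
    Both classes must be present in labels.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    fpr, tpr = [0.0], [0.0]
    # Lower the threshold one distinct score at a time, highest first.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for k in range(1, len(fpr)):
        area += (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
    return area


# Hypothetical classifier scores for six recordings (1 = AF, 0 = not AF).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
fpr, tpr = roc_curve(labels, scores)
area = auc(fpr, tpr)
print(f"AUC = {area:.3f}")
```

An AUC of 1 corresponds to a curve that passes through the top-left corner, while an AUC of 0.5 corresponds to guessing at random.&lt;br /&gt;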
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a decision tree (DT) classifier for AF detection. Hence, these features were also tested with the SVM algorithm. We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract 169 features, shown in Table 3.5. The ROC curve for each pre-processing option with these features is shown in Figure 3.7.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.5: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&amp;lt;ref name=LN_F/&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
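&lt;br /&gt;
As an illustration of the time-domain row of the table, the sketch below computes SDNN, RMSSD and pNN50 from a list of RR intervals. It is not the ExtractFeatures.m tool or our own extraction algorithm: the function name and the RR values are hypothetical, pNN50 is used as one common member of the NNx family, and SDNN is computed here as the sample standard deviation, which is an assumption since conventions vary.&lt;br /&gt;

```python
import math


def hrv_time_features(rr_ms):
    """Compute basic time-domain HRV features from RR intervals in ms."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((rr - mean_rr) ** 2 for rr in rr_ms) / (n - 1))
    # Successive differences between neighbouring RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # RMSSD: root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # pNN50: fraction of successive differences larger than 50 ms.
    pnn50 = sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    return {"SDNN": sdnn, "RMSSD": rmssd, "pNN50": pnn50}


# Hypothetical RR intervals (ms): a steady rhythm and an irregular one.
regular = [800, 810, 795, 805, 800, 790]
irregular = [620, 910, 540, 1010, 700, 480]
print(hrv_time_features(regular))
print(hrv_time_features(irregular))
```

All three values are much larger for the irregular rhythm, which is the kind of separation a classifier can exploit when detecting the variable RR intervals associated with AF.&lt;br /&gt;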
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=400px&amp;gt;&lt;br /&gt;
File:SVM HRV AF.png|&amp;#039;&amp;#039;Figure 3.6: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;&lt;br /&gt;
File:SVM TS AF.png|&amp;#039;&amp;#039;Figure 3.7: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
The code first attempts to classify the data without extracting any features, which serves as a baseline for later comparison. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units (meaning each signal is mapped to 100 features), and its output is passed to a fully connected layer. Three classes are output: normal ECG, AF, and other abnormality. The training progress is shown in Figure 3.8. Notice that the accuracy sits at around 40%, and that training takes a considerable amount of time (over 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction was used to improve these results. By default, the program extracted the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, each 3000-sample signal is reduced to a 2-by-63 feature matrix. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure 3.9. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
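&lt;br /&gt;
The idea behind spectral entropy can be sketched in a few lines. The example below is an illustration only and is not the MathWorks implementation: it computes a single normalised entropy value for a whole segment using a slow direct DFT, whereas the features used in this project were computed over time windows. The function names and test signals are assumptions of this example, and the input is assumed to be non-zero.&lt;br /&gt;

```python
import cmath
import math


def power_spectrum(x):
    """One-sided power spectrum from a direct (slow) DFT."""
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(x):
    """Normalised Shannon entropy of the power spectrum.

    Values near 0 mean the power sits in one spike; values near 1 mean
    the spectrum is flat.
    """
    power = power_spectrum(x)
    total = sum(power)
    probs = [p / total for p in power if p > 0.0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy / math.log2(len(power))


n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]  # spiky spectrum
impulse = [1.0] + [0.0] * (n - 1)                             # flat spectrum
print(spectral_entropy(tone), spectral_entropy(impulse))
```

The pure tone gives a value close to 0 and the impulse gives a value close to 1, matching the spiky-versus-flat interpretation given above.&lt;br /&gt;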
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section below.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure 3.8: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure 3.9: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning involves taking models that were previously trained on large sets of general images and training them further on our dataset, as demonstrated in Figure 3.10. An advantage of this method is that the model does not need to be built and trained from scratch, which is time-consuming and requires a large dataset. However, the model still needed to be trained and fine-tuned to recognise patterns in our ECG recordings.&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|centre|&amp;#039;&amp;#039;Figure 3.10: Transfer Learning flow chart.&amp;#039;&amp;#039;&amp;lt;ref name=LN_M/&amp;gt;]]&lt;br /&gt;
In this project, we modified the MathWorks transfer learning example&amp;lt;ref name=LN_CNN&amp;gt;The MathWorks, Inc.; &amp;#039;&amp;#039;Classify Time Series Using Wavelet Analysis and Deep Learning&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html&amp;lt;/ref&amp;gt;. The ROC curve of the results from this classifier for each pre-processing technique is shown in Figure 3.11.&lt;br /&gt;
[[File:SqueezeNet.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.11: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table 4.1 and Figures 4.2 and 4.3 below. In order to compare the results, a single measure that suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world sample set containing 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the precision for the abnormal class is 100% and the recall is 50%, so the F1-score is 66.7%, which is significantly lower than the accuracy but gives far more meaning to the results.&lt;br /&gt;
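&lt;br /&gt;
The accuracy and F1-score figures in the example above can be checked with a few lines of Python. The function name is an assumption of this sketch, and the counts are those of the 98 normal and 2 abnormal cases described in the text, with the abnormal class treated as positive.&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1-score from raw counts for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# The example from the text: 98 normal and 2 abnormal recordings, where
# every normal case and one abnormal case are predicted as normal.
tp, fp, fn, tn = 1, 0, 1, 98
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
# prints: accuracy=0.990 precision=1.000 recall=0.500 f1=0.667
```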
&lt;br /&gt;
In each case, the results were displayed as a confusion chart. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted, the stronger the diagonal in the confusion matrix, and the better the performance of the classifier.&lt;br /&gt;
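&lt;br /&gt;
A confusion matrix of the kind described above can be built by counting pairs of true and predicted classes. The sketch below uses hypothetical labels for eight recordings, not results from this project, and the function name is an assumption of this example.&lt;br /&gt;

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count recordings by (true class, predicted class).

    Rows are true classes and columns are predicted classes, both ordered
    as given in classes.
    """
    index = {name: k for k, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


# Hypothetical predictions (N = normal, A = AF, O = other abnormality).
classes = ["N", "A", "O"]
truth = ["N", "N", "N", "A", "A", "O", "O", "O"]
guess = ["N", "N", "O", "A", "N", "O", "O", "A"]
cm = confusion_matrix(truth, guess, classes)
for name, row in zip(classes, cm):
    print(name, row)
```

Correct predictions fall on the diagonal, so five of the eight hypothetical recordings here are classified correctly, and the off-diagonal entries show which classes are being confused with each other.&lt;br /&gt;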
&lt;br /&gt;
Our findings are summarised in Table 4.1 and Figure 4.2 below, using the F1-score of the AF class. These results demonstrate that the CNN and the SVM using 169 features outperformed the other classification methods, especially when wavelet denoising was used. The LSTM also achieved a high score with wavelet denoising; however, it relied on instantaneous frequency and spectral entropy, which are sensitive to noise. The MoV removed certain low-frequency components and hence degraded the extracted features, resulting in lower performance for all classifiers. In addition, the 10 time-domain HRV features appear to be the most important features for the SVM, since the SVM using only these features scored only slightly lower than the 169-feature SVM. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Figure 4.3 shows the ROC curve for the best result from each classification method. It demonstrates that the multi-feature SVM and the CNN rank very closely, and are notably better than the other classification methods investigated.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 4.1: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HRV || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HRV || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HRV || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8135&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8357&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.7597&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery heights=350px mode=packed&amp;gt;&lt;br /&gt;
File:F1 Scores of Results.png|&amp;#039;&amp;#039;Figure 4.2: Comparison of Results for each Technique.&amp;#039;&amp;#039;&lt;br /&gt;
File:FinalPerformance.png|&amp;#039;&amp;#039;Figure 4.3: Robustness comparison between various classifiers.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
So, can we teach a machine to be a cardiologist? The short answer is yes. Teaching a machine to accurately recognise different heart conditions by analysing the ECG recordings of patients is entirely possible, as our results have shown. It is also worth mentioning that studies in the literature have reported better results than ours, so with a deeper understanding and more fine-tuning, a highly reliable model could be created.&lt;br /&gt;
&lt;br /&gt;
Future work could improve classification performance, either by modifying the combination of pre-processing, feature extraction and classification to find the optimal solution, or by finding different methods for each of these processes that are better suited to the data. Our model was designed to distinguish AF from normal and other abnormal conditions, but the classifier could be extended to identify a greater range of cardiovascular conditions.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17189</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17189"/>
		<updated>2021-10-24T11:07:55Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. In this project, ECG recordings were collected from the PhysioNet Database&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt;, and have been classified using existing ML techniques.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Signals collected from the human body are often used to measure or verify a patient&amp;#039;s health. One example of a biological signal of interest is the electrocardiogram (ECG), which is collected by placing electrodes on the skin around the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we explored various methods of classifying ECGs, and pre-processing methods to improve this.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to:&lt;br /&gt;
* Investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns;&lt;br /&gt;
* Extend this to distinguishing between different heart diseases; and,&lt;br /&gt;
* Find a reasonably good method to do this.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
ECGs represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf&amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between subsequent R peaks, are of interest, since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles) which push blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles; the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between successive heartbeats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on the ECG&amp;lt;ref name=SK_B/&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG as well. This can make the ECG harder for a physician to interpret; hence, pre-processing and machine learning classification of ECGs may make it possible to diagnose patients more precisely.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that covers heart disease, stroke and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although for this project only atrial fibrillation (AF) was considered. AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF is characterised by palpitations, shortness of breath and chest pain, and its incidence increases with age.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so the methods and results of a number of previous studies could be examined. This section also briefly discusses the processes and results found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be done with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies of 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
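&lt;br /&gt;
The band-pass idea can be sketched by cascading a first-order high-pass and a first-order low-pass filter. This is a deliberately simple illustration and not the filter design used in the cited studies: first-order filters roll off far more gently than the filters normally used for ECGs, no notch filter is included, and the 300 Hz sampling rate, function names and test signals are assumptions of this example.&lt;br /&gt;

```python
import math


def low_pass(x, cutoff_hz, fs_hz):
    """First-order low-pass filter (discretised RC filter)."""
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    y = [x[0]]
    for sample in x[1:]:
        y.append(y[-1] + alpha * (sample - y[-1]))
    return y


def high_pass(x, cutoff_hz, fs_hz):
    """First-order high-pass filter (discretised RC filter)."""
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    y = [0.0]
    for prev, cur in zip(x, x[1:]):
        y.append(alpha * (y[-1] + cur - prev))
    return y


def band_pass(x, low_hz, high_hz, fs_hz):
    """Keep roughly the low_hz to high_hz band by cascading both filters."""
    return low_pass(high_pass(x, low_hz, fs_hz), high_hz, fs_hz)


fs = 300.0  # assumed sampling rate in Hz
t = [k / fs for k in range(600)]
slow = [math.sin(2 * math.pi * 5.0 * v) for v in t]    # inside the pass band
fast = [math.sin(2 * math.pi * 100.0 * v) for v in t]  # outside the pass band
# Peak output amplitude over the second half, after start-up transients.
slow_peak = max(abs(v) for v in band_pass(slow, 0.5, 30.0, fs)[300:])
fast_peak = max(abs(v) for v in band_pass(fast, 0.5, 30.0, fs)[300:])
print(slow_peak, fast_peak)
```

The 5 Hz component passes almost unchanged while the 100 Hz component is strongly attenuated, and a constant offset (a crude stand-in for baseline wander) is removed entirely by the high-pass stage.&lt;br /&gt;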
&lt;br /&gt;
Wavelet denoising works in quite a different manner. The signal is decomposed into wavelet coefficients, and a threshold is applied so that the signal is concentrated in only a few of these coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
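&lt;br /&gt;
The decompose, threshold and reconstruct sequence described above can be sketched with the Haar wavelet, which is the simplest wavelet to implement. This is an illustration only: the Haar wavelet is used here for brevity rather than because it resembles ECG features, and the function names, threshold value and test signal are all assumptions of this example. The signal length must be divisible by 2 raised to the number of levels.&lt;br /&gt;

```python
import math


def haar_forward(x):
    """One level of the Haar wavelet transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[k] + x[k + 1]) / s for k in range(0, len(x), 2)]
    detail = [(x[k] - x[k + 1]) / s for k in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one level of the Haar transform."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x


def soft_threshold(coeffs, thr):
    """Shrink coefficients towards zero; small ones become zero."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def haar_denoise(x, thr, levels=2):
    """Threshold the detail coefficients at each level, then reconstruct."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append(soft_threshold(detail, thr))
    for detail in reversed(details):
        approx = haar_inverse(approx, detail)
    return approx


# Hypothetical example: a step from 1 to 3 with small alternating "noise".
noisy = [1.0 + 0.05 * (-1) ** k for k in range(8)]
noisy += [3.0 + 0.05 * (-1) ** k for k in range(8)]
clean = haar_denoise(noisy, thr=0.2, levels=2)
```

In this example the small detail coefficients produced by the alternating noise fall below the threshold and are set to zero, so the reconstruction recovers the underlying step.&lt;br /&gt;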
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;M. Dorraki, A. Fouladzadeh, A. Allison, B.R. Davis and D. Abbott; On moment of velocity for signal analysis, in Royal Society Open Science, vol. 6, issue 3, 2019, Available: https://royalsocietypublishing.org/doi/full/10.1098/rsos.182001&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency; however, it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is usually quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is critical to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between them. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and other intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
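&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched as follows. This is a simplified illustration rather than the full Pan-Tompkins algorithm: the band-pass filtering stage is omitted, a fixed threshold at half the maximum replaces the adaptive thresholds of the original method, and the function names, window lengths and synthetic test signal are assumptions of this example.&lt;br /&gt;

```python
def derivative(x):
    """First difference, emphasising the steep slopes of the QRS complex."""
    return [0.0] + [b - a for a, b in zip(x, x[1:])]


def moving_window_integral(x, width):
    """Sum of the most recent width samples, divided by width."""
    out, running = [], 0.0
    for k, value in enumerate(x):
        running += value
        if k >= width:
            running -= x[k - width]
        out.append(running / width)
    return out


def detect_r_peaks(ecg, fs_hz, window_s=0.15, refractory_s=0.25):
    """Return approximate sample indices of QRS complexes.

    Simplifications (assumptions of this sketch): no band-pass stage, and
    a fixed threshold instead of the adaptive thresholds of the original.
    """
    squared = [d * d for d in derivative(ecg)]
    width = max(1, round(window_s * fs_hz))
    integrated = moving_window_integral(squared, width)
    threshold = 0.5 * max(integrated)
    refractory = round(refractory_s * fs_hz)
    peaks = []
    for k, value in enumerate(integrated):
        # Accept a crossing only if the previous beat is far enough away.
        if value >= threshold and (not peaks or k - peaks[-1] > refractory):
            peaks.append(k)
    return peaks


# Synthetic test signal (hypothetical): one narrow spike per second at 100 Hz.
fs = 100
ecg = [0.0] * 500
for p in (100, 200, 300, 400):
    ecg[p - 1], ecg[p], ecg[p + 1] = 0.5, 1.0, 0.5
peaks = detect_r_peaks(ecg, fs)
rr_seconds = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
print(peaks, rr_seconds)
```

The differences between successive detections give the RR intervals, which are the starting point for the rhythm-based features discussed in this section.&lt;br /&gt;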
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features describe how the frequency content of a non-stationary signal varies with time. One tool for time-frequency analysis is the scalogram, which is displayed as an image and can therefore be classified by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique is wavelet decomposition. Just as Fourier analysis decomposes a signal into a sum of sinusoids, wavelet decomposition decomposes the signal into a sum of wavelets&amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;, as illustrated in Figure 2.8. The idea is to reduce a long signal (for example, 9000 samples) to a much shorter set of features (e.g. 190). This can significantly decrease computation time while improving classification performance. A comparison of the ECG, the wavelet-denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class separated into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this knowledge to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An SVM is a supervised machine learning algorithm which can be used to classify data based on the value of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the region in which it falls. Figure 2.10 shows a simple 2-dimensional example with Class 1 in red and Class 2 in blue. If a new data point, as shown by the green dot in Figure 2.10, is introduced, the SVM will classify it as Class 2, given the side of the line on which it falls.&lt;br /&gt;
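&lt;br /&gt;
The final step, assigning a class according to the side of the line on which a point falls, can be sketched in Python as below. Finding the separating line is the training step and is not shown; the weights, bias and test point are hypothetical values chosen only for illustration, and the function name is an assumption.&lt;br /&gt;

```python
def classify(point, weights, bias):
    # The sign of the decision function w.x + b indicates which side of
    # the separating line (or hyperplane) the point falls on.
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return 2 if score > 0 else 1


# Hypothetical separating line x1 + x2 = 5 in a 2D feature space.
weights = [1.0, 1.0]
bias = -5.0

new_point = [4.0, 3.0]
label = classify(new_point, weights, bias)
```

With these assumed values the new point lies on the positive side of the line, so it is assigned to Class 2.&lt;br /&gt;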
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computation required for classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
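&lt;br /&gt;
The convolution and pooling stages can be illustrated on a one-dimensional signal with the short Python sketch below. It is a minimal example under stated assumptions: the kernel is fixed by hand (in a real CNN the kernel values are learned during training), the input values are arbitrary, and the function names are chosen for this example only.&lt;br /&gt;

```python
def convolve1d(signal, kernel):
    # Sliding dot product over every position where the kernel fits
    # entirely inside the signal (cross-correlation, as used in CNNs).
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(n)]


def max_pool1d(features, size):
    # Keep the largest value in each non-overlapping block of `size` values.
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]


signal = [0, 0, 1, 3, 1, 0, 0, 2, 0]
edge_kernel = [-1, 0, 1]  # hypothetical kernel responding to rising edges

features = convolve1d(signal, edge_kernel)   # 7 feature values
pooled = max_pool1d(features, 2)             # reduced to 3 values
```

Pooling shortens the feature list from seven values to three while keeping the strongest responses, which is how the pooling layers reduce the work left for the fully-connected layer.&lt;br /&gt;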
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;.
Parts of the classifier can then be modified as needed to improve its performance for the new task.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks learn which information in a sequence to keep and which to forget, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to classify between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; developed a novel approach using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. Figure 3.1 shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The workflow loops from signal filtering back to classification assessment, since various machine learning techniques were investigated, as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.1: ECG classification methodology.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG and MathWorks Example ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.2 shows an example of a normal, healthy ECG waveform. Notice that the rhythm (i.e. the time between R-peaks) is relatively constant, and that all ECG features are clearly noticeable and have the expected locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.4 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.4: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;MATLAB ECG Wavelet Classification&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An example from MathWorks demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier using MATLAB&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html&amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into a training set and a test set. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. This was a suitable starting point from which to compare later results.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigated the effects on ECG classification of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity (MoV) were applied to the same dataset, then the raw dataset and these cleaned versions were fed into the classifiers to measure the importance of the pre-processing stage.&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models. The SVM was chosen due to its relative simplicity, the CNN was selected as it is effective at analysing images such as spectrograms, and the LSTM network was chosen as it is simpler than other neural networks like the CNN, but still shares some of its advantages.&lt;br /&gt;
&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF can be distinguished from other heart rhythms by analysing the beat-to-beat intervals of an ECG recording. With that aim, we performed feature extraction to obtain information about heart rate variability (HRV), before using the SVM to recognise the pattern of AF signals. Figure 3.6 shows the receiver operating characteristic (ROC) curves of the SVM for each of the three pre-processing options, using HRV feature extraction. The closer the ROC curve hugs the top-left corner, the better the classification. Hence wavelet denoising was the most effective pre-processing technique in this case.&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, these features were also tested with the SVM algorithm. We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features listed in Table 3.5. The ROC curve for each pre-processing option with these features is shown in Figure 3.7.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 3.5: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&amp;lt;ref name=LN_F/&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
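&lt;br /&gt;
To give a sense of what the time-domain HRV features in Table 3.5 measure, the Python sketch below computes SDNN, RMSSD and NNx from a list of RR-intervals using their standard definitions. It is not the ExtractFeatures.m tool or our own extraction algorithm; the function names, the 50 ms threshold and the two RR-interval lists are assumptions made for illustration.&lt;br /&gt;

```python
import math


def sdnn(rr_ms):
    # Sample standard deviation of the RR (NN) intervals, in milliseconds.
    mean = sum(rr_ms) / len(rr_ms)
    return math.sqrt(sum((x - mean) ** 2 for x in rr_ms) / (len(rr_ms) - 1))


def rmssd(rr_ms):
    # Root mean square of the differences between successive intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def nnx(rr_ms, x_ms=50):
    # Number of successive interval pairs that differ by more than x ms.
    return sum(1 for a, b in zip(rr_ms, rr_ms[1:]) if abs(b - a) > x_ms)


# Illustrative RR-intervals in milliseconds (not taken from the dataset).
regular = [800, 810, 795, 805, 800]      # steady rhythm
irregular = [620, 910, 540, 1020, 700]   # highly variable rhythm

regular_features = (sdnn(regular), rmssd(regular), nnx(regular))
irregular_features = (sdnn(irregular), rmssd(irregular), nnx(irregular))
```

All three values are larger for the irregular list, which is the kind of separation a classifier can exploit when RR-interval variability differs between classes.&lt;br /&gt;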
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=400px&amp;gt;&lt;br /&gt;
File:SVM HRV AF.png|&amp;#039;&amp;#039;Figure 3.6: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;&lt;br /&gt;
File:SVM TS AF.png|&amp;#039;&amp;#039;Figure 3.7: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, this code first attempts to classify the data without extracting any features, which serves as a baseline for later comparison. The classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units (meaning each signal is mapped to 100 features) and prepares the output for the fully-connected layer (neural network). Three classes are output: normal ECG, AF, and other abnormality. The training progress is shown in Figure 3.8. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time (over 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction was used to improve these results. By default, the program extracted the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 feature matrix. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure 3.9. Attempts were made to alter the features extracted; however, these either led to errors or extremely poor results, and so are not shown here.&lt;br /&gt;
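&lt;br /&gt;
The idea behind spectral entropy can be sketched in Python as follows. This is an illustration only and not the MATLAB feature extraction used in the project: it computes a single value for a whole signal rather than a value for each time window, it uses a naive discrete Fourier transform, and the function names and test signals are assumptions.&lt;br /&gt;

```python
import cmath
import math


def power_spectrum(signal):
    # Naive DFT, keeping only the non-negative frequency bins.
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(signal):
    # Shannon entropy of the normalised power spectrum, scaled so that
    # 0 means one dominant frequency and 1 means a perfectly flat spectrum.
    power = power_spectrum(signal)
    total = sum(power)
    probs = [p / total for p in power if p > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy / math.log2(len(power))


n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]  # spiky spectrum
impulse = [1.0] + [0.0] * (n - 1)                             # flat spectrum

tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

A pure tone concentrates its power in one frequency bin and gives an entropy close to 0, while an impulse spreads its power evenly across all bins and gives an entropy close to 1.&lt;br /&gt;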
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section below.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure 3.8: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure 3.9: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormality in cardiovascular systems. Transfer learning takes a model that was previously trained on a large set of general images and trains it further on our dataset, as demonstrated in Figure 3.10. An advantage of this method is that the model does not need to be built and trained from scratch, which would be time-consuming and require a large dataset. However, the model still needed to be trained and fine-tuned to recognise patterns in our ECG recordings.&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|centre|&amp;#039;&amp;#039;Figure 3.10: Transfer Learning flow chart.&amp;#039;&amp;#039;&amp;lt;ref name=LN_M/&amp;gt;]]&lt;br /&gt;
The ROC curve of the results from this classifier for each pre-processing technique is shown in Figure 3.11. In this project, we modified the transfer learning code from a MathWorks example&amp;lt;ref name=LN_CNN&amp;gt;The MathWorks, Inc.; &amp;#039;&amp;#039;Classify Time Series Using Wavelet Analysis and Deep Learning&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html&amp;lt;/ref&amp;gt;.&lt;br /&gt;
[[File:SqueezeNet.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 3.11: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table 4.1 and Figures 4.2 and 4.3 below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the precision for the abnormal class is 100% and the recall is 50%, so the F1-score is 66.7%, which is significantly lower than the accuracy but far more informative.&lt;br /&gt;
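&lt;br /&gt;
The arithmetic in this example can be checked with a few lines of Python. The counts come directly from the example above, with the abnormal class treated as the positive class; the function and variable names are chosen for illustration.&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 98 normal and 2 abnormal cases; one abnormal case is detected and the
# other is wrongly labelled normal. No normal case is labelled abnormal.
tp, fp, fn, tn = 1, 0, 1, 98

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

Here the accuracy evaluates to 0.99 while the F1-score is about 0.667, matching the figures quoted above.&lt;br /&gt;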
&lt;br /&gt;
In each case, the results were displayed as a confusion chart. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted, the stronger the diagonal in the confusion matrix, and the better the performance of the classifier.&lt;br /&gt;
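&lt;br /&gt;
As a small illustration of how a confusion chart is built, the Python sketch below counts each pairing of true and predicted class. The class names and the six example labels are hypothetical and are not results from this project.&lt;br /&gt;

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    # Rows are true classes and columns are predicted classes.
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


classes = ["Normal", "AF", "Other"]

# Hypothetical labels for six recordings.
true_labels = ["Normal", "Normal", "AF", "AF", "Other", "Other"]
predicted = ["Normal", "Normal", "AF", "Other", "Other", "Normal"]

matrix = confusion_matrix(true_labels, predicted, classes)
correct = sum(matrix[i][i] for i in range(len(classes)))
```

Correct predictions accumulate on the diagonal, so off-diagonal entries show exactly which classes are being confused with each other.&lt;br /&gt;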
&lt;br /&gt;
Our findings are summarised in Table 4.1 and Figure 4.2 below, using the F1-score of the AF class. These results demonstrate that the CNN and the SVM using 169 features outperformed the other classification methods, especially when wavelet denoising was used. The LSTM also achieved a high score with wavelet denoising; however, it relied on instantaneous frequency and spectral entropy, which are sensitive to noise. In addition, MoV removed certain low-frequency components, and hence negatively impacted the features, resulting in lower performance in all classifiers. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Figure 4.3 shows the ROC curve for the best result from each classification method. It demonstrates that the multi-feature SVM and the CNN rank very closely, and are notably better than the other classification methods investigated.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 4.1: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HRV || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HRV || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HRV || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8135&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8357&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.7597&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery heights=350px mode=packed&amp;gt;&lt;br /&gt;
File:F1 Scores of Results.png|&amp;#039;&amp;#039;Figure 4.2: Comparison of Results for each Technique.&amp;#039;&amp;#039;&lt;br /&gt;
File:FinalPerformance.png|&amp;#039;&amp;#039;Figure 4.3: Robustness comparison between various classifiers.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
So, can we teach a machine to be a cardiologist? The short answer is yes. Teaching a machine to accurately recognise different heart conditions by analysing the ECG recordings of patients is entirely possible, as our results have shown. It is also worth mentioning that studies in the literature have achieved higher performance than ours, so with a deeper understanding and more fine-tuning, a highly reliable model could be created.&lt;br /&gt;
&lt;br /&gt;
Future work could improve classification performance, either by modifying the combination of pre-processing, feature extraction and classification to find the optimal solution, or by finding different methods for each of these processes which are better suited to the data. Our model was designed to distinguish AF from normal and other abnormal conditions, but the classifier could be extended to identify a greater range of cardiovascular conditions.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17071</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=17071"/>
		<updated>2021-10-24T03:51:07Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One biological signal of particular interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf &amp;lt;/ref&amp;gt;. ECGs can therefore be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P wave, T wave and QRS complex, as well as the time between subsequent R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing), as well as noise, are often overlaid on the ECG. For these reasons, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focusses on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF is characterised by palpitations, shortness of breath and chest pain, and its incidence increases with age.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works quite differently. The signal is first decomposed using wavelets, and a threshold is then applied so that the signal is concentrated in only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
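&lt;br /&gt;
The decompose, threshold and reconstruct idea can be illustrated with a minimal sketch. It assumes a single-level Haar wavelet and soft thresholding, neither of which is specified in the literature cited above; the function names, test signal and threshold value are hypothetical and chosen only for illustration.&lt;br /&gt;

```python
import math

def haar_step(signal):
    """One level of the Haar transform: (approximation, detail) coefficients."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def inverse_haar_step(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    s = 1 / math.sqrt(2)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) * s, (a - d) * s])
    return signal

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient towards zero by the threshold."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def haar_denoise(signal, threshold):
    """Single-level denoising; the signal length must be even."""
    approx, detail = haar_step(signal)
    return inverse_haar_step(approx, soft_threshold(detail, threshold))

# Hypothetical noisy step signal (not ECG data).
noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
clean = haar_denoise(noisy, threshold=0.5)
```

In this sketch the small detail coefficients, which mostly carry the sample-to-sample jitter, are set to zero, while the large step in the signal survives in the approximation coefficients. A practical ECG denoiser would typically use several decomposition levels and a wavelet shaped more like the QRS complex.&lt;br /&gt;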
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
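&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched as follows. This is a simplified illustration only: it omits the initial band-pass filter, uses a plain first difference rather than the derivative filter of the published algorithm, and the window length and sample values are hypothetical.&lt;br /&gt;

```python
def differentiate(signal):
    """First difference, which suppresses slow baseline components."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]

def square(signal):
    """Point-wise squaring: emphasises steep slopes and makes all values non-negative."""
    return [x * x for x in signal]

def moving_window_integrate(signal, window):
    """Average over the current sample and up to window - 1 preceding samples."""
    out = []
    running = 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= window:
            running -= signal[i - window]
        out.append(running / min(i + 1, window))
    return out

def pan_tompkins_envelope(ecg, window=3):
    """Chain the three stages to obtain an envelope that peaks near each QRS complex."""
    return moving_window_integrate(square(differentiate(ecg)), window)

# Hypothetical samples with one sharp R-like peak at index 4.
ecg = [0.0, 0.0, 0.1, 0.0, 1.0, -0.2, 0.0, 0.1, 0.0, 0.0]
envelope = pan_tompkins_envelope(ecg)
peak_index = max(range(len(envelope)), key=envelope.__getitem__)
```

The envelope peaks close to the sharp spike and stays near zero elsewhere, which is what makes a simple threshold on the envelope sufficient to locate candidate R-peaks.&lt;br /&gt;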
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
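&lt;br /&gt;
As a rough illustration of the maximum-amplitude frequency feature, the sketch below locates the largest non-DC component of a direct discrete Fourier transform. The synthetic two-tone signal, the sampling rate and the function name are assumptions made for the example and are not taken from Hu et al.&lt;br /&gt;

```python
import cmath
import math

def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the largest non-DC DFT component."""
    n = len(signal)
    best_bin, best_amplitude = 1, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        if abs(coeff) > best_amplitude:
            best_bin, best_amplitude = k, abs(coeff)
    return best_bin * sample_rate / n

sample_rate = 50  # Hz, hypothetical
duration = 4      # seconds
# Synthetic stand-in for an ECG: a strong 1 Hz tone plus a weaker 5 Hz tone.
samples = [
    math.sin(2 * math.pi * 1.0 * i / sample_rate)
    + 0.3 * math.sin(2 * math.pi * 5.0 * i / sample_rate)
    for i in range(sample_rate * duration)
]
peak_hz = dominant_frequency(samples, sample_rate)
```

With four seconds of data the frequency resolution is 0.25 Hz, so longer recordings give a finer estimate. A fast Fourier transform would be used in practice; the direct sum is shown only to keep the sketch self-contained.&lt;br /&gt;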
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class further divided into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
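&lt;br /&gt;
The split, train and validate procedure can be sketched as follows. The classifier here is a deliberately simple nearest-class-mean rule used as a stand-in, not one of the models used in this project, and the feature values, labels and function names are hypothetical.&lt;br /&gt;

```python
import random

def take(seq, indices):
    """Select the items of seq at the given indices."""
    return [seq[i] for i in indices]

def train_test_split(samples, labels, test_fraction=0.25, seed=0):
    """Shuffle the data, then hold out a fraction of it for testing."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    n_test = max(1, round(len(order) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (take(samples, train_idx), take(labels, train_idx),
            take(samples, test_idx), take(labels, test_idx))

def fit_class_means(samples, labels):
    """Training: store the mean feature value of each class."""
    means = {}
    for label in set(labels):
        values = [s for s, l in zip(samples, labels) if l == label]
        means[label] = sum(values) / len(values)
    return means

def predict(means, sample):
    """Assign the class whose mean is closest to the sample."""
    return min(means, key=lambda label: abs(means[label] - sample))

def accuracy(predicted, actual):
    """Fraction of test samples whose assigned class matches the actual class."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical single feature (e.g. RR-interval variability) for two classes.
features = [0.02, 0.03, 0.04, 0.05, 0.03, 0.04, 0.20, 0.25, 0.22, 0.30, 0.27, 0.24]
labels = ["normal"] * 6 + ["AF"] * 6
x_train, y_train, x_test, y_test = train_test_split(features, labels)
model = fit_class_means(x_train, y_train)
test_accuracy = accuracy([predict(model, x) for x in x_test], y_test)
```

The key point is that the labels of the test set are never shown to the model during training, so the resulting accuracy estimates how well the classifier generalises to unseen recordings.&lt;br /&gt;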
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long-short term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An SVM is a supervised machine learning algorithm which can be used to classify data based on the value of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on where it falls relative to the hyperplane. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
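&lt;br /&gt;
Once a separating hyperplane has been found, classifying a new point reduces to checking which side of it the point lies on. The sketch below shows only this decision step for a linear SVM; the weights and bias are hypothetical values chosen by hand rather than learned from data, and the training procedure is not shown.&lt;br /&gt;

```python
def linear_svm_predict(weights, bias, point):
    """Return the class on whose side of the hyperplane the point falls."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return "Class 2" if score >= 0 else "Class 1"

# Hypothetical hyperplane x1 + x2 = 5 in a 2D feature space (not a trained model).
weights, bias = [1.0, 1.0], -5.0
new_point = [4.0, 3.0]
label = linear_svm_predict(weights, bias, new_point)
```

Here the new point gives a positive score and is therefore assigned to class 2, mirroring the green point in the 2D example.&lt;br /&gt;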
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational power required for classification. Finally, a fully-connected layer is used to classify the data, and this is usually a regular ANN. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;u&amp;gt;&amp;#039;&amp;#039;Long-Short Term Memory&amp;#039;&amp;#039;&amp;lt;/u&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between five classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ECG classification.&amp;#039;&amp;#039;]]&lt;br /&gt;
Figure X shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The workflow loops from signal filtering back to classification assessment, since we will investigate various machine learning techniques as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier in MATLAB&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine on how to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity will be applied to the same dataset, then both the raw dataset and its cleaned versions will be fed into classifiers to measure the importance of the pre-processing step.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm, causing the heart to beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we will perform feature engineering that extracts information about heart rate variability (HRV), and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
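&lt;br /&gt;
To illustrate how beat-to-beat intervals can be turned into HRV features, the sketch below computes three common time-domain measures (SDNN, RMSSD and NNx) from a list of RR intervals. It is not the feature extraction code used in this project; the RR values are hypothetical, and the use of the sample standard deviation for SDNN and a 50 ms default for NNx are assumptions.&lt;br /&gt;

```python
import math
import statistics

def hrv_time_features(rr_ms, x_ms=50):
    """Time-domain HRV features from RR intervals given in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        # Sample standard deviation of the intervals.
        "SDNN": statistics.stdev(rr_ms),
        # Root mean square of successive differences.
        "RMSSD": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        # Number of successive differences larger than x_ms.
        "NNx": sum(abs(d) > x_ms for d in diffs),
    }

regular_rr = [800, 810, 795, 805, 800, 790]     # hypothetical steady rhythm
irregular_rr = [620, 980, 540, 1100, 700, 860]  # hypothetical AF-like rhythm
regular = hrv_time_features(regular_rr)
irregular = hrv_time_features(irregular_rr)
```

All three measures are much larger for the irregular sequence than for the steady one, which is the kind of separation between classes that a classifier such as an SVM can exploit.&lt;br /&gt;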
&lt;br /&gt;
[[File:SVM HRV AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we will experiment with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:SVM TS AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features.&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, this code first attempts to classify the data without extracting any features; this result is used as a comparison later. The classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and it then prepares the output for the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, each 3000-sample signal is reduced to a 2-by-63 feature matrix. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, which have already been trained on a large number of general images, as the starting point for learning from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it is able to recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
[[File:SqueezeNet.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
In this experiment, we modified the transfer learning example code from MathWorks&amp;lt;ref name=LN_CNN&amp;gt;The MathWorks, Inc.; &amp;#039;&amp;#039;Classify Time Series Using Wavelet Analysis and Deep Learning&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/wavelet/ug/classify-time-series-using-wavelet-analysis-and-deep-learning.html&amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
[[File:SqueezeNetAchitecture.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Architecture of SqueezeNet.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table 4.1 and Figures 4.2 and 4.3 below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set may contain 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the abnormal class has a precision of 100% and a recall of 50%, so its F1-score would be 66.7%, which is significantly lower than the accuracy but gives far more meaning to the results.&lt;br /&gt;
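&lt;br /&gt;
The worked example above can be checked with a short Python sketch. This is only an illustration of the arithmetic and is not the code used in the project; the function name and the counts are taken from, or assumed for, the hypothetical 98/2 example.&lt;br /&gt;

```python
# Worked example from the text: 98 normal and 2 abnormal recordings,
# with one abnormal recording misclassified as normal.
# "Positive" here means the abnormal class.
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

tp, fp, fn, tn = 1, 0, 1, 98
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(round(accuracy, 3), round(f1, 3))  # 0.99 0.667
```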
&lt;br /&gt;
In each case, the results were displayed as a confusion chart. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
&lt;br /&gt;
Our findings are summarised in Table 4.1 and Figure 4.2 below, using the F1-score of the AF class. These results demonstrate that, in general, the CNN and the SVM using 169 features outperformed the other classification methods. The LSTM also achieved a high score with wavelet denoising. However, it used instantaneous frequency and spectral entropy, which are sensitive to noise. In addition, MoV removed certain low-frequency components, which negatively impacted these two features and resulted in the low performance of the LSTM. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Figure 4.3 shows...&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 4.1: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HRV || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HRV || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HRV || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8135&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8357&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.7597&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery heights=350px mode=packed&amp;gt;&lt;br /&gt;
File:F1 Scores of Results.png|&amp;#039;&amp;#039;Figure 4.2: Comparison of Results for each Technique.&amp;#039;&amp;#039;&lt;br /&gt;
File:FinalPerformance.png|&amp;#039;&amp;#039;Figure 4.3: Robustness comparison between various classifiers.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
So, can we teach a machine to be a cardiologist? The short answer is yes. Our results show that it is entirely possible to teach a machine to accurately recognise different heart conditions by analysing the ECG recordings of patients. It is also worth mentioning that studies in the literature have reported higher scores than ours, so with a deeper understanding and more fine-tuning, a highly reliable model could be created.&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by modifying the combination of pre-processing, feature extraction and classification to find the optimal solution, or by finding different methods for each of these processes which are better suited to the data. Our model was designed to distinguish AF from normal and other abnormal conditions, but the classifier could be extended to classify a greater range of cardiovascular conditions.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:SqueezeNetAchitecture.png&amp;diff=17070</id>
		<title>File:SqueezeNetAchitecture.png</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:SqueezeNetAchitecture.png&amp;diff=17070"/>
		<updated>2021-10-24T03:50:05Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;squeezenet architecture&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16856</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16856"/>
		<updated>2021-10-21T19:18:20Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One biological signal of interest is the electrocardiogram (ECG). These signals are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so the early detection and treatment of CVD is critical.&lt;br /&gt;
&lt;br /&gt;
There has been a recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review,  in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf  &amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between subsequent R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles; the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs on different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only activity of the heart, but other muscular contractions. As such artefacts (for example from motion or breathing), as well as noise, are often overlaid on the ECG as well. In this way, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors are able to be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focused on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. The signal is first decomposed into wavelet coefficients, and a threshold is then applied so that the signal is concentrated over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
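&lt;br /&gt;
The decompose, threshold and reconstruct idea can be sketched in a few lines of Python. This is a minimal illustration only: it assumes a single-level Haar wavelet and soft thresholding of the detail coefficients, neither of which is stated to be the configuration used in this project, and the threshold value in the usage line is arbitrary.&lt;br /&gt;

```python
import math

def haar_forward(signal):
    """One level of the Haar wavelet transform (even-length input assumed)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft_threshold(coeffs, threshold):
    """Shrink coefficients towards zero; small ones become exactly zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def denoise(signal, threshold):
    """Threshold the detail coefficients only, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))

# Hypothetical usage: small sample-to-sample wiggles are flattened.
smoothed = denoise([1.0, 1.2, 0.9, 1.1, 5.0, 5.1], threshold=0.5)
```

A practical denoiser would normally use several decomposition levels and a wavelet whose shape resembles the ECG features, as described above.&lt;br /&gt;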
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
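&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched as follows. This is a simplified illustration rather than the published algorithm or the code used in this project: the initial band-pass filter and the adaptive peak-detection thresholds are omitted, and the window width is an assumed value.&lt;br /&gt;

```python
def derivative(signal):
    """First difference; emphasises the steep slopes of the QRS complex."""
    return [signal[i] - signal[i - 1] for i in range(1, len(signal))]

def squared(signal):
    """Point-wise squaring; makes all values positive and boosts large slopes."""
    return [x * x for x in signal]

def moving_window_integral(signal, width):
    """Average over a trailing window of at most `width` samples."""
    out = []
    running = 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= width:
            running -= signal[i - width]
        out.append(running / min(i + 1, width))
    return out

def qrs_envelope(ecg, width=5):
    """Chain the three stages; peaks in the output mark candidate QRS complexes."""
    return moving_window_integral(squared(derivative(ecg)), width)

# Hypothetical toy signal with a single sharp spike standing in for an R-wave.
toy_ecg = [0, 0, 0, 0, 1, 5, 1, 0, 0, 0, 0, 0]
envelope = qrs_envelope(toy_ecg, width=3)
```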
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum of comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class separated further into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set of data. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
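&lt;br /&gt;
The split and validation steps can be sketched generically in Python. The function names, the 20% test fraction and the fixed seed are assumptions made for illustration and do not describe the split used in this project.&lt;br /&gt;

```python
import random

def train_test_split(samples, labels, test_fraction=0.2, seed=0):
    """Shuffle the data, then hold out a fraction of it for testing."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(test_idx)

def accuracy(predicted, actual):
    """Fraction of test recordings whose assigned class matches the true class."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

# Hypothetical usage with placeholder data: ten recordings, two classes.
recordings = list(range(10))
classes = [r % 2 for r in recordings]
(train_x, train_y), (test_x, test_y) = train_test_split(recordings, classes)
```

Only the training portion is shown to the classifier; the held-out portion is used once, at the end, to compare the assigned classes with the true classes.&lt;br /&gt;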
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the value of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set of data are then plotted in the same n-dimensional space, and each is assigned a class based on the region in which it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example the hand-written numbers in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between five classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; proposed a novel approach using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ECG classification.&amp;#039;&amp;#039;]]&lt;br /&gt;
Figure X shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. There is a loop from signal filtering to classification assessment, since we will investigate various machine learning techniques as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into the classifiers to measure the importance of pre-processing.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm which causes the heart to beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we will perform feature engineering that extracts information about heart rate variability (HRV), and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
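&lt;br /&gt;
As a rough illustration of the kind of time-domain heart rate variability features listed in Table X, the following Python sketch computes SDNN and RMSSD from a list of RR intervals. It is not the MATLAB code used in this project: the function names and the example interval values are hypothetical, chosen only to show that an irregular rhythm yields larger values than a steady one.&lt;br /&gt;

```python
import math
import statistics


def sdnn(rr_ms):
    """Standard deviation of the RR (NN) intervals, in milliseconds."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR intervals in milliseconds (illustrative values, not project data).
steady_rr = [800, 810, 790, 805, 795, 800]
irregular_rr = [620, 940, 710, 1050, 580, 880]

for label, rr in (("steady", steady_rr), ("irregular", irregular_rr)):
    print(label, round(sdnn(rr), 1), round(rmssd(rr), 1))
```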
&lt;br /&gt;
[[File:SVM HRV AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we will experiment with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:SVM TS AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We developed our own algorithm for selecting and extracting the HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features.&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, this code first attempts to classify the data without extracting any features, which is used as a comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and it prepares the output for the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits at around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, each 3000-sample signal is reduced to a 2-by-63 array. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted, however these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
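&lt;br /&gt;
To make the idea of spectral entropy more concrete, the Python sketch below computes a normalised entropy of the power spectrum for two synthetic signals. It is a simplified illustration under stated assumptions rather than the routine used by the MathWorks example: it returns a single value for the whole signal instead of one value per time window, and the function names and test signals are hypothetical. A pure tone concentrates its power in one frequency bin and gives a value near 0, whereas an impulse has a flat spectrum and gives a value near 1.&lt;br /&gt;

```python
import cmath
import math


def power_spectrum(x):
    """One-sided power spectrum of x, computed with a direct DFT."""
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(x):
    """Shannon entropy of the normalised power spectrum, scaled to lie in [0, 1]."""
    power = power_spectrum(x)
    total = sum(power)
    probs = [p / total for p in power]
    # Bins with exactly zero power contribute nothing to the entropy.
    entropy = -sum(p * math.log2(p) for p in probs if p)
    return entropy / math.log2(len(probs))


n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]  # single frequency
impulse = [1.0] + [0.0] * (n - 1)  # equal power at every frequency

print(round(spectral_entropy(tone), 3), round(spectral_entropy(impulse), 3))
```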
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, which have already been trained on a large number of general images, to learn from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it is able to recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
[[File:SqueezeNet.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the F1-score for identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy but gives far more meaning to the results.&lt;br /&gt;
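&lt;br /&gt;
The arithmetic in the example above can be checked with a short Python sketch. The function name is hypothetical and the counts are taken directly from the example (98 normal cases, 2 abnormal cases, one abnormal case labelled as normal); it is an illustration only, not the evaluation code used in this project.&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    """Return precision, recall and F1-score from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts for the abnormal class in the example from the text:
# 98 normal and 2 abnormal cases, with one abnormal case labelled as normal.
tp, fp, fn, tn = 1, 0, 1, 98

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

# Prints: accuracy=0.990 precision=1.000 recall=0.500 F1=0.667
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```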
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest results, the LSTM holds an advantage of being quicker and less computationally intensive to use, whilst still being notably more effective than the SVM classifier. In all cases the wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN and the SVM using 169 features outperformed the other classification methods. The LSTM also achieved a high result with wavelet denoising. However, it used instantaneous frequency and spectral entropy, which are sensitive to noise. In addition, the MoV removed certain low-frequency components, which negatively affected these two features and resulted in the low performance of the LSTM. In all cases the wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HRV || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HRV || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HRV || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8135&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8357&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.7597&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
[[File:FinalPerformance.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Robustness comparison between various classifiers.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16855</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16855"/>
		<updated>2021-10-21T18:44:44Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One biological signal of interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review,  in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf  &amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular the P wave, T wave and QRS complex, as well as time between subsequent R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly show whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs on different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only activity of the heart, but other muscular contractions. As such artefacts (for example from motion or breathing), as well as noise, are often overlaid on the ECG as well. In this way, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focussed on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. Instead of filtering by frequency, the signal is decomposed into wavelet coefficients, and a threshold is applied so that the signal is represented by only a few significant coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering because particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
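&lt;br /&gt;
As a rough illustration of the thresholding idea, the following minimal Python sketch applies a single-level Haar wavelet transform, soft-thresholds the detail coefficients and reconstructs the signal. The Haar wavelet, the single decomposition level, the threshold value and all function names are assumptions chosen for brevity; they are not the wavelet or settings used in this project, and the sketch assumes an even number of samples.&lt;br /&gt;

```python
import math

def haar_forward(signal):
    # One level of the Haar transform: pairwise sums (approximation) and
    # pairwise differences (detail), both scaled by 1/sqrt(2).
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def haar_inverse(approx, detail):
    # Exact inverse of haar_forward.
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def soft_threshold(coeffs, threshold):
    # Shrink each coefficient towards zero; small ones become zero.
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def haar_denoise(signal, threshold):
    # Only the detail coefficients are thresholded (illustrative choice).
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))

# Hypothetical noisy samples, for illustration only.
noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 1.0, 0.8]
print(haar_denoise(noisy, threshold=0.5))
```

With a threshold of zero the input is reconstructed unchanged; larger thresholds remove progressively more of the small, rapid fluctuations while the approximation coefficients preserve the overall shape.&lt;br /&gt;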
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
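&lt;br /&gt;
The later stages of this process can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published algorithm: the band-pass filtering and adaptive thresholding stages are omitted, a plain first difference stands in for the derivative filter, and the window width and the synthetic trace are hypothetical values.&lt;br /&gt;

```python
def derivative(signal):
    # First difference: emphasises the steep slopes of the QRS complex.
    return [signal[i] - signal[i - 1] for i in range(1, len(signal))]

def squared(signal):
    # Squaring makes every sample non-negative and amplifies large slopes.
    return [x * x for x in signal]

def moving_window_integral(signal, width):
    # Sum of the most recent samples (up to width of them), divided by width.
    return [sum(signal[max(0, i - width + 1):i + 1]) / width
            for i in range(len(signal))]

def strongest_qrs_index(ecg, width=5):
    integrated = moving_window_integral(squared(derivative(ecg)), width)
    # max() returns the first index at which the largest value occurs.
    return max(range(len(integrated)), key=lambda i: integrated[i])

# Synthetic trace: flat baseline with one sharp R-like spike at sample 25.
ecg = [0.0] * 50
ecg[24], ecg[25], ecg[26] = 0.3, 1.0, 0.2
print(strongest_qrs_index(ecg))
```

The index of the maximum of the integrated signal should fall close to the spike, slightly delayed by the trailing integration window. A practical detector would locate every such maximum above a threshold rather than only the largest one.&lt;br /&gt;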
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
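&lt;br /&gt;
The frequency component with the maximum amplitude can be located with a discrete Fourier transform. The Python sketch below is a minimal illustration only: it uses a direct (slow) DFT from the standard library, and the two test signals are synthetic sinusoids whose frequencies are assumptions loosely based on the 1 Hz and 2 to 8 Hz figures quoted above, not real ECG recordings.&lt;br /&gt;

```python
import cmath
import math

def dft_magnitudes(signal):
    # Magnitude of each DFT bin from 0 Hz up to the Nyquist frequency.
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        mags.append(abs(acc))
    return mags

def dominant_frequency(signal, sample_rate):
    mags = dft_magnitudes(signal)
    # Skip bin 0 so that a constant offset is never reported as dominant.
    k = max(range(1, len(mags)), key=lambda j: mags[j])
    return k * sample_rate / len(signal)

sample_rate = 50   # Hz, hypothetical
n = 200            # 4 seconds of samples
normal_like = [math.sin(2 * math.pi * 1.0 * i / sample_rate) for i in range(n)]
af_like = [math.sin(2 * math.pi * 5.0 * i / sample_rate)
           + 0.4 * math.sin(2 * math.pi * 1.0 * i / sample_rate)
           for i in range(n)]
print(dominant_frequency(normal_like, sample_rate),
      dominant_frequency(af_like, sample_rate))   # prints: 1.0 5.0
```

The frequency resolution is the sample rate divided by the number of samples (0.25 Hz here), so longer recordings give a finer estimate of the dominant component.&lt;br /&gt;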
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is the scalogram, which is displayed as an image and can therefore be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique is wavelet decomposition. Just as Fourier analysis decomposes a signal into a sum of sinusoids in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency spectrum comparison of normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class further separated into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
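&lt;br /&gt;
The split-and-validate procedure can be sketched in Python as follows. This is a generic illustration rather than the code used in this project: the function names, the 20 percent test fraction, the fixed random seed and the toy labels are all assumptions.&lt;br /&gt;

```python
import random

def split_dataset(samples, labels, test_fraction, seed):
    # Shuffle the indices reproducibly, then cut off the test portion.
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(samples) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test

def accuracy(predicted, actual):
    # Fraction of test items whose assigned class matches the actual class.
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

# Toy data: ten placeholder recordings with hypothetical labels.
samples = list(range(10))
labels = ["N", "N", "AF", "N", "AF", "N", "N", "AF", "N", "N"]
train, test = split_dataset(samples, labels, test_fraction=0.2, seed=0)
print(len(train), len(test))
```

Only the training pairs are shown to the classifier; the held-out test pairs are used solely to compare assigned and actual classes.&lt;br /&gt;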
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. Each signal in the test set is then plotted in the same n-dimensional space, and is assigned a class based on the region in which it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of data classification. Finally, a fully-connected layer is used to classify the data, and this is usually a regular ANN. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to classify between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ECG classification.&amp;#039;&amp;#039;]]&lt;br /&gt;
Figure X shows the workflow for distinguishing AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The workflow loops back from classification assessment to signal filtering, since we investigate various machine learning techniques as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;Mathworks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the extracted features. The data was split into two sets: a training set and a test set. The training set was used to train the machine on how to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into the classifiers to measure the importance of the pre-processing step.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm which makes the heart beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we will perform feature engineering that extracts information about heart rate variability (HRV), and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM HRV AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we will experiment with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: HRV and heartbeat morphology features&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number of Features&lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
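&lt;br /&gt;
Two of the time-domain HRV features in the table above, SDNN and RMSSD, can be computed directly from a list of RR intervals. The Python sketch below is a minimal illustration under stated assumptions and is not the feature extraction code used in this project: the RR values are invented, and SDNN is taken here as the sample standard deviation.&lt;br /&gt;

```python
import math
import statistics

def sdnn(rr_ms):
    # Standard deviation of the beat-to-beat (NN) intervals, in ms.
    return statistics.stdev(rr_ms)

def rmssd(rr_ms):
    # Root mean square of the differences between successive intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical RR intervals in milliseconds.
regular = [800, 810, 790, 805, 795]       # steady rhythm
irregular = [620, 910, 540, 1020, 700]    # highly variable rhythm
for name, rr in (("regular", regular), ("irregular", irregular)):
    print(name, round(sdnn(rr), 1), round(rmssd(rr), 1))
```

Both measures grow as the RR intervals become more variable, which is why they are natural candidates for separating AF from a regular rhythm.&lt;br /&gt;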
&lt;br /&gt;
[[File:SVM TS AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features.&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, the code first attempts to classify the data without extracting any features, which is used as a comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and prepares the output for the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits at around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 array. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
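&lt;br /&gt;
To give a feel for the entropy feature, the Python sketch below computes a normalised spectral entropy for a whole signal segment. It is an assumption-laden illustration, not the MATLAB routine used by the example: that routine produces a value per time window, whereas this sketch returns a single value, and it uses a direct DFT and synthetic test signals.&lt;br /&gt;

```python
import cmath
import math

def power_spectrum(signal):
    # Squared DFT magnitude for each bin from 0 Hz up to Nyquist.
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))) ** 2
            for k in range(n // 2 + 1)]

def spectral_entropy(signal):
    # Treat the normalised power spectrum as a probability distribution
    # and compute its Shannon entropy, scaled to lie between 0 and 1.
    power = power_spectrum(signal)
    total = sum(power)
    probs = [p / total for p in power]
    entropy = -sum(p * math.log(p) for p in probs if p)
    return entropy / math.log(len(probs))

n = 64
tone = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]   # spiky spectrum
impulse = [1.0] + [0.0] * (n - 1)                              # flat spectrum
print(round(spectral_entropy(tone), 3), round(spectral_entropy(impulse), 3))
```

A single sinusoid concentrates its power in one bin and so gives an entropy near 0, while an impulse spreads its power evenly over all bins and gives an entropy near 1.&lt;br /&gt;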
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, which have already been trained on a large number of general images, as a starting point for learning from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it can recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
[[File:SqueezeNet.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, precision is 100% and recall is 50%, so the F1-score for identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy but gives a far more meaningful picture of the results.&lt;br /&gt;
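&lt;br /&gt;
The arithmetic of this example can be checked with a short Python sketch. The function name is hypothetical; the counts are those of the example above, with the abnormal class treated as the positive class (1 true positive, 0 false positives, 1 false negative and 98 true negatives).&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    # Each ratio falls back to 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

tp, fp, fn, tn = 1, 0, 1, 98
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(accuracy, precision, recall, round(f1, 3))   # prints: 0.99 1.0 0.5 0.667
```

The true negatives do not appear in the F1 calculation at all, which is why a large majority class cannot inflate the score in the way it inflates accuracy.&lt;br /&gt;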
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
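&lt;br /&gt;
A confusion matrix of this kind can be tallied from paired lists of true and predicted labels, as in the Python sketch below. The class names and the label lists are invented for illustration and are not results from this project.&lt;br /&gt;

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    # Rows are true classes, columns are predicted classes.
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

classes = ["N", "AF", "Other"]
actual = ["N", "N", "N", "AF", "AF", "Other", "Other", "N"]
predicted = ["N", "N", "AF", "AF", "N", "Other", "N", "N"]
matrix = confusion_matrix(actual, predicted, classes)
for name, row in zip(classes, matrix):
    print(name, row)
```

Entries on the diagonal count correct predictions; each off-diagonal entry shows which true class (row) was mistaken for which predicted class (column).&lt;br /&gt;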
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest score, the LSTM holds the advantage of being quicker and less computationally intensive to use, whilst still being more effective than the SVM classifier trained on HRV features alone. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || HRV || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || HRV || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || HRV || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8135&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.8357&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity || Time and Frequency Domain, Signal Quality, and Non-linear and Morphological Features || 0.7597&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
[[File:FinalPerformance.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Robustness comparison between various classifiers.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16854</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16854"/>
		<updated>2021-10-21T18:32:58Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Signals collected from the human body are often used to measure or verify a patient&amp;#039;s health. One biological signal of particular interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormality in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than manual analysis&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf &amp;lt;/ref&amp;gt;. ECGs can therefore be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between successive R peaks, are of interest, since any irregularity in, or absence of, any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles; the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between successive heartbeats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may present dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart but also other muscular contractions. As such, artefacts (for example from motion or breathing), as well as noise, are often overlaid on the ECG. For these reasons, pre-processing followed by machine learning classification may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors are able to be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focuses on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF becomes more common with age, and is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. The signal is first decomposed into wavelet coefficients. Because the decomposition concentrates the signal in a few large coefficients while noise is spread across many small ones, coefficients below a chosen threshold can be shrunk or discarded before the signal is reconstructed&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering because particular types of wavelets are similar in shape to ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
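&lt;br /&gt;
The idea can be sketched in a few lines of Python. This is an illustration only, assuming a single-level Haar wavelet and soft thresholding. Practical ECG denoising would normally use several decomposition levels and a smoother wavelet, and the function name, sample values and threshold here are hypothetical rather than the settings used in this project.&lt;br /&gt;
```python
import math


def haar_denoise(signal, threshold):
    # One-level Haar decomposition of an even-length signal.
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    # Soft thresholding: shrink every detail coefficient towards zero.
    detail = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    # Inverse Haar transform rebuilds the signal from the kept coefficients.
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


# Hypothetical noisy samples with one large, genuine step.
noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 1.0, 0.8]
denoised = haar_denoise(noisy, 0.3)
```
With a threshold of zero the function returns the input unchanged, which is a quick check that the forward and inverse transforms agree.&lt;br /&gt;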
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
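&lt;br /&gt;
The later stages of the algorithm can be sketched in Python as follows. This is a simplified illustration that assumes the band-pass filtering stage has already been applied and that uses a first difference as the derivative. The function name, window length and sample values are hypothetical.&lt;br /&gt;
```python
def pan_tompkins_envelope(ecg, window):
    # Differentiate to emphasise the steep slopes of the QRS complex.
    derivative = [ecg[n] - ecg[n - 1] for n in range(1, len(ecg))]
    # Square to make all values positive and enhance the large slopes.
    squared = [d * d for d in derivative]
    # Moving-window integration over the most recent samples.
    integrated = []
    for n in range(len(squared)):
        start = max(0, n - window + 1)
        integrated.append(sum(squared[start:n + 1]) / window)
    return integrated


# Hypothetical filtered ECG containing two sharp R peaks.
ecg = [0.0, 0.0, 0.1, 1.0, 0.1, 0.0, 0.0, 0.0, 0.1, 1.0, 0.1, 0.0]
envelope = pan_tompkins_envelope(ecg, 3)
```
The envelope is largest around each R peak, so peaks can then be located by thresholding it.&lt;br /&gt;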
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
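&lt;br /&gt;
As a rough illustration of this frequency-domain feature, the Python sketch below finds the frequency component with the maximum amplitude using a naive discrete Fourier transform. The function name, the 50 Hz sampling rate and the synthetic 5 Hz test signal are assumptions made for the example, and a real implementation would use a fast Fourier transform.&lt;br /&gt;
```python
import cmath
import math


def dominant_frequency(signal, fs):
    # Remove the mean so the zero-frequency term does not dominate.
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    # Magnitude of each positive-frequency DFT bin (naive, quadratic cost).
    mags = [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centred)))
            for k in range(1, n // 2 + 1)]
    # Bin numbering starts at 1 because bin 0 was skipped.
    best_bin = max(range(len(mags)), key=mags.__getitem__) + 1
    return best_bin * fs / n


# Synthetic test signal: two seconds of a 5 Hz sine sampled at 50 Hz.
fs = 50.0
signal = [math.sin(2 * math.pi * 5.0 * i / fs) for i in range(100)]
peak = dominant_frequency(signal, fs)
```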
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Comparison of the Frequency Spectra of Normal and AF ECGs.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class separated further into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this knowledge to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
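&lt;br /&gt;
The split and validation steps can be sketched in Python as below. The function names, the 20% test fraction, the fixed random seed and the placeholder data are all assumptions made for illustration, and accuracy is used as the simplest possible validation measure.&lt;br /&gt;
```python
import random


def split_train_test(samples, labels, test_fraction, seed):
    # Shuffle the indices reproducibly, then hold out a fraction for testing.
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


def accuracy(assigned, actual):
    # Fraction of test recordings whose assigned class matches the actual class.
    matches = sum(1 for a, b in zip(assigned, actual) if a == b)
    return matches / len(actual)


# Placeholder data: ten recordings with alternating class labels.
samples = list(range(10))
labels = [i % 2 for i in samples]
train, test = split_train_test(samples, labels, 0.2, 0)
```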
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the side of the boundary on which it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
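&lt;br /&gt;
Once a linear boundary has been found, classifying a new point reduces to checking which side of the boundary it falls on. The Python sketch below shows only this prediction step. The weights, bias and points are hypothetical, and the training step that finds the separating line is not shown.&lt;br /&gt;
```python
import math


def svm_predict(weights, bias, features):
    # Signed score of the point relative to the boundary w.x + b = 0.
    score = sum(w * x for w, x in zip(weights, features)) + bias
    # Class 2 lies on the non-negative side, class 1 on the negative side.
    return 2 if math.copysign(1.0, score) == 1.0 else 1


# Hypothetical 2D boundary: the diagonal line where both features are equal.
weights, bias = [1.0, -1.0], 0.0
new_point_class = svm_predict(weights, bias, [3.0, 1.0])
```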
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
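&lt;br /&gt;
The convolution and pooling stages can be illustrated with a small one-dimensional Python sketch. Image classifiers apply the same operations in two dimensions. The kernel, the input values and the function names are hypothetical, and in a real CNN the kernel values are learned during training.&lt;br /&gt;
```python
def conv1d(signal, kernel):
    # Valid cross-correlation, which is what CNN convolution layers compute.
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    # Keep positive responses and zero out the rest.
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    # Non-overlapping max pooling shrinks the feature map by a factor of size.
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


# Hypothetical input and a kernel that responds to rising edges.
signal = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]
edge_kernel = [-1.0, 0.0, 1.0]
feature_map = relu(conv1d(signal, edge_kernel))
pooled = max_pool(feature_map, 2)
```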
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, Rashed-Al-Mahfuz et al. recommended using a pre-trained model with high performance at a related task to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTMs are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks use gates to keep or forget information as a sequence is processed, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
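&lt;br /&gt;
The keep-or-forget behaviour can be made concrete with a single LSTM unit operating on scalar values, sketched in Python below. The weights are hypothetical and untrained, the function names are our own, and a practical network would use vectors and matrices with many such units.&lt;br /&gt;
```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    # params holds (input weight, recurrent weight, bias) for the
    # forget, input, candidate and output paths, in that order.
    pre = [wx * x + wh * h_prev + b for wx, wh, b in params]
    forget_gate = sigmoid(pre[0])   # how much old memory to keep
    input_gate = sigmoid(pre[1])    # how much new information to admit
    candidate = math.tanh(pre[2])   # proposed new memory content
    output_gate = sigmoid(pre[3])   # how much memory to expose
    c = forget_gate * c_prev + input_gate * candidate
    h = output_gate * math.tanh(c)
    return h, c


# Hypothetical, untrained weights; real values would be learned.
params = [(0.5, 0.1, 0.0), (0.6, 0.2, 0.0), (0.9, 0.3, 0.0), (0.7, 0.1, 0.0)]
h, c = 0.0, 0.0
for sample in [0.1, 0.4, 1.0, 0.2]:
    h, c = lstm_step(sample, h, c, params)
```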
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between five classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ECG classification.&amp;#039;&amp;#039;]]&lt;br /&gt;
Figure X shows the workflow for distinguishing AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The loop from signal filtering back to classification assessment reflects our investigation of various machine learning techniques as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine on how to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the classifier achieved an accuracy of approximately 98% on the test set. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise the signal, we will investigate the effects of two other filtering methods discussed in the literature. Wavelet denoising and the Moment of Velocity will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into the classifiers to measure the importance of the pre-processing stage.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm which makes the heart beat chaotically and irregularly compared to the normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we perform feature engineering to extract information about heart rate variability (HRV), and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM HRV AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we experiment with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
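&lt;br /&gt;
As a rough illustration of the time-domain HRV features listed in the table, the sketch below computes SDNN and RMSSD from a list of R-peak times. It is not the code used in this project: the function names and the R-peak times are hypothetical, and the sample standard deviation is assumed for SDNN.&lt;br /&gt;

```python
import math


def rr_intervals(r_peak_times):
    """Differences between consecutive R-peak times, in seconds."""
    return [b - a for a, b in zip(r_peak_times, r_peak_times[1:])]


def sdnn(rr):
    """Standard deviation of the RR intervals (sample standard deviation assumed)."""
    mean_rr = sum(rr) / len(rr)
    return math.sqrt(sum((x - mean_rr) ** 2 for x in rr) / (len(rr) - 1))


def rmssd(rr):
    """Root mean square of the successive RR-interval differences."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical R-peak times in seconds, made up for illustration only.
regular_peaks = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0]
irregular_peaks = [0.0, 0.6, 1.7, 2.1, 3.3, 3.8]

regular_rr = rr_intervals(regular_peaks)
irregular_rr = rr_intervals(irregular_peaks)
print(sdnn(regular_rr), rmssd(regular_rr))
print(sdnn(irregular_rr), rmssd(irregular_rr))
```

Both measures are close to zero for the evenly spaced peaks and clearly larger for the irregular ones, which is the kind of separation an SVM can exploit when detecting AF.&lt;br /&gt;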
&lt;br /&gt;
&lt;br /&gt;
[[File:SVM TS AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We developed our own algorithm for selecting and extracting the HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features.&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, the code first attempts to classify the data without extracting any features, which is used as a comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and its output is passed to the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits at around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, each 3000-sample signal is reduced to a 2-by-63 array. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, which have already been trained on a large number of general images, to learn from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it is able to recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
[[File:FinalPerformance.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of CNN models using raw/wavelet/MoV denoising techniques and Scalogram.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes the results was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. So in this example, the F1-score for identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy, but gives far more meaning to the results.&lt;br /&gt;
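&lt;br /&gt;
The arithmetic behind this example can be checked with a short sketch. The function names are illustrative only and are not part of the project code; the abnormal class is treated as the positive class.&lt;br /&gt;

```python
def f1_score(tp, fp, fn):
    """F1-score from true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0  # no correct detections, so precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Example from the text: 98 normal and 2 abnormal recordings, with one
# abnormal recording wrongly labelled as normal.
tp, fn, fp, tn = 1, 1, 0, 98
example_accuracy = accuracy(tp, tn, fp, fn)
example_f1 = f1_score(tp, fp, fn)
print(round(example_accuracy, 3), round(example_f1, 3))
```

For these counts the precision is 1 and the recall is 0.5, so the sketch prints an accuracy of 0.99 and an F1-score of 0.667.&lt;br /&gt;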
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
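&lt;br /&gt;
A confusion chart of this kind can be tallied in a few lines. The sketch below is only an illustration with made-up labels, not the results of this project; the class names and the helper function are assumptions.&lt;br /&gt;

```python
from collections import Counter

CLASSES = ["Normal", "AF", "Other"]


def confusion_matrix(true_labels, predicted_labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in CLASSES] for t in CLASSES]


# Hypothetical labels, made up for illustration only.
true_labels = ["Normal", "Normal", "AF", "AF", "Other", "Other", "Normal"]
predicted = ["Normal", "Other", "AF", "Normal", "Other", "Other", "Normal"]

matrix = confusion_matrix(true_labels, predicted)
for name, row in zip(CLASSES, matrix):
    print(name, row)

# Entries on the diagonal are the correctly classified recordings.
correct = sum(matrix[i][i] for i in range(len(CLASSES)))
print(correct / len(true_labels))
```

Off-diagonal entries show which classes are being confused with each other, for example an AF recording predicted as normal.&lt;br /&gt;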
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that, in general, the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest scores, the LSTM has the advantage of being quicker and less computationally intensive to use, whilst still outperforming the SVM classifier when wavelet denoising was applied (0.817 compared with 0.7935). In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16853</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16853"/>
		<updated>2021-10-21T18:30:58Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment, or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal which is of interest is the electrocardiogram (ECG). These signals are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review,  in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf  &amp;lt;/ref&amp;gt;. As a result, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between subsequent R peaks, are of interest, since any irregularity in, or absence of, any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing), as well as noise, are often overlaid on the ECG. For these reasons, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focuses on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. A wavelet decomposition is applied to the signal, and a threshold is used to concentrate the signal over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering because particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
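&lt;br /&gt;
The idea of thresholding wavelet coefficients can be sketched with a single level of the Haar wavelet transform. This is a simplified illustration and not the denoising used in this project: the choice of the Haar wavelet, the soft threshold, the threshold value and the sample data are all assumptions.&lt;br /&gt;

```python
import math


def haar_forward(signal):
    """One level of the Haar wavelet transform (signal length must be even)."""
    s = math.sqrt(2.0)
    approx = [(a + b) / s for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) / s for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient towards zero; small ones become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


# Hypothetical noisy samples and threshold, chosen only for illustration.
noisy = [1.0, 1.1, 0.9, 1.0, 4.0, 4.2, 1.0, 0.9]
approx, detail = haar_forward(noisy)
denoised = haar_inverse(approx, soft_threshold(detail, threshold=0.2))
print([round(x, 2) for x in denoised])
```

Here all of the detail coefficients fall below the threshold, so the small sample-to-sample fluctuations are removed while the large step in the middle of the signal is preserved by the approximation coefficients.&lt;br /&gt;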
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
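&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched as follows. This is a simplified illustration rather than the full Pan-Tompkins algorithm: the band-pass filter and the adaptive peak detection are omitted, a plain first difference stands in for the derivative filter, and the toy signal and window width are assumptions.&lt;br /&gt;

```python
def derivative(signal):
    """First difference, which emphasises the steep slopes of the QRS complex."""
    return [b - a for a, b in zip(signal, signal[1:])]


def squared(signal):
    """Point-wise squaring: makes all samples positive and amplifies large slopes."""
    return [x * x for x in signal]


def moving_window_integral(signal, width):
    """Average over a sliding window of the given width (in samples)."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


# Hypothetical toy signal: a flat baseline with two sharp R-like spikes.
ecg = [0.0] * 40
ecg[10] = 1.0
ecg[30] = 1.0

stage = moving_window_integral(squared(derivative(ecg)), width=3)
peak_index = max(range(len(stage)), key=stage.__getitem__)
print(peak_index)
```

The integrated output is zero along the baseline and rises only around the two spikes, which is what allows a simple threshold to locate the R-peaks in the later stages of the algorithm.&lt;br /&gt;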
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum of comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class separated further into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters of each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set of data. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the location in which it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
LSTM networks are a type of recurrent neural network (RNN) well-suited to classifying time-series data. They are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to classify between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, combining a bidirectional LSTM network with a wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ECG classification.&amp;#039;&amp;#039;]]&lt;br /&gt;
Figure X shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to classification assessment. The loop from signal filtering to classification assessment reflects that we investigated various machine learning techniques as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
There is an example from MathWorks which demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier in MATLAB&amp;lt;ref&amp;gt;Mathworks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigated the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity were applied to the same dataset, and then the raw dataset and its cleaned versions were fed into the classifiers to measure the importance of the pre-processing stage. &lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm, causing the heart to beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we performed feature engineering that extracts information about heart rate variability (HRV), and used an SVM to recognise the pattern of AF signals.&lt;br /&gt;
&lt;br /&gt;
[[File:SVM HRV AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and HRV features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we tested these features with the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: HRV and heartbeat morphology features&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
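&lt;br /&gt;
To illustrate the time-domain HRV features listed in the table above, the following Python sketch computes SDNN, RMSSD and NNx from a list of RR intervals. It is a minimal example rather than the MATLAB code used in this project; the function name hrv_time_features, the 50 ms threshold and the sample RR values are assumptions made for illustration.&lt;br /&gt;

```python
import math
import statistics


def hrv_time_features(rr_ms, threshold_ms=50.0):
    """Return SDNN, RMSSD and NNx for a sequence of RR intervals in milliseconds."""
    # Differences between successive RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = statistics.stdev(rr_ms)
    # RMSSD: root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # NNx: number of successive differences larger than the threshold.
    nnx = sum(1 for d in diffs if abs(d) > threshold_ms)
    return {"sdnn": sdnn, "rmssd": rmssd, "nnx": nnx}


# Hypothetical RR intervals (ms): a steady rhythm and an irregular one.
regular = [800, 810, 790, 805, 795, 800]
irregular = [620, 940, 710, 1010, 560, 880]

print(hrv_time_features(regular))
print(hrv_time_features(irregular))
```

An irregular rhythm, such as that expected in AF, gives larger values for all three features than a steady rhythm, which is what makes them useful inputs to a classifier.&lt;br /&gt;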
&lt;br /&gt;
&lt;br /&gt;
[[File:SVM TS AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of SVM models using raw/wavelet/MoV denoising techniques and multiple features.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m, provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt;, to extract the 169 features.&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, the code first attempts to classify the data without extracting any features; this result is used as a comparison later. The classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and its output is passed to the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, each 3000-sample signal is reduced to a 2-by-63 matrix. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
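&lt;br /&gt;
The idea behind spectral entropy can be sketched in a few lines of Python. The example below is an illustration only and is not the MATLAB code used in this project: it computes a single normalised entropy value for a whole signal using a naive discrete Fourier transform, whereas the features described above are computed over a series of time windows. The function names and the two test signals are assumptions made for illustration.&lt;br /&gt;

```python
import cmath
import math


def power_spectrum(signal):
    """One-sided power spectrum of a real signal, computed with a naive DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(signal):
    """Shannon entropy of the normalised power spectrum, scaled to [0, 1]."""
    power = power_spectrum(signal)
    total = sum(power)
    probs = [p / total for p in power]
    # Bins that hold almost no power contribute nothing to the sum.
    entropy = -sum(p * math.log2(p) for p in probs if p > 1e-12)
    return entropy / math.log2(len(probs))


n = 64
# A pure tone concentrates its power in one bin (spiky spectrum).
tone = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
# A single impulse spreads its power evenly over all bins (flat spectrum).
impulse = [1.0] + [0.0] * (n - 1)

print(round(spectral_entropy(tone), 3))
print(round(spectral_entropy(impulse), 3))
```

The pure tone gives an entropy close to zero and the impulse gives an entropy close to one, matching the description of a spiky versus a flat spectrum.&lt;br /&gt;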
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, already trained on a large number of general images, as the starting point for learning from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it can recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the precision for the abnormal class is 100% and the recall is 50%, so the F1-score is 66.7%, which is significantly lower than the accuracy but gives far more meaning to the results.&lt;br /&gt;
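&lt;br /&gt;
The arithmetic of the 98-normal, 2-abnormal example can be checked with a short Python sketch. The function name precision_recall_f1 is an assumption made for illustration; the counts are those of the example above, with the abnormal class treated as the positive class.&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1-score from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts for the abnormal class: 98 normal and 2 abnormal cases, with every
# normal case and one abnormal case predicted as normal.
tp, fp, fn, tn = 1, 0, 1, 98

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

# Prints: accuracy=0.990 precision=1.000 recall=0.500 f1=0.667
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```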
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
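&lt;br /&gt;
As a sketch of how a per-class F1-score can be read off a confusion chart, the Python example below treats each class in turn as the positive class. The function name per_class_f1 and the counts in the matrix are hypothetical and chosen only for illustration; they are not results from this project.&lt;br /&gt;

```python
def per_class_f1(confusion, labels):
    """F1-score of each class from a square confusion matrix.

    Rows are true classes and columns are predicted classes.
    """
    scores = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        # Members of this class that were predicted as something else.
        fn = sum(confusion[i]) - tp
        # Members of other classes that were predicted as this class.
        fp = sum(row[i] for row in confusion) - tp
        denominator = 2 * tp + fp + fn
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


labels = ["Normal", "AF", "Other"]
# Hypothetical counts for illustration only; these are not the project's results.
confusion = [
    [90, 2, 8],
    [3, 40, 7],
    [10, 5, 35],
]

for label, score in per_class_f1(confusion, labels).items():
    print(f"{label}: {score:.3f}")
```

A strong diagonal gives a high score for every class, while off-diagonal counts in a class's row or column lower that class's score.&lt;br /&gt;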
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest results, the LSTM holds an advantage of being quicker and less computationally intensive to use, whilst still being notably more effective than the SVM classifier. In all cases the wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16852</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16852"/>
		<updated>2021-10-21T18:25:28Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Signals collected from the human body are often used to measure or verify a patient&amp;#039;s health. One biological signal of particular interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review,  in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf  &amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular the P wave, T wave and QRS complex, as well as the time between subsequent R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs on different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only activity of the heart, but other muscular contractions. As such artefacts (for example from motion or breathing), as well as noise, are often overlaid on the ECG as well. In this way, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors are able to be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focussed on just one, atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. The signal is decomposed using a wavelet transform, and a threshold is applied so that the signal is concentrated over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
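&lt;br /&gt;
The later stages of this process can be sketched in a few lines of Python. This is a simplified illustration only: the band-pass filtering stage is omitted, the window width is a hand-picked assumption, and the toy signal is a single artificial spike rather than a real ECG.&lt;br /&gt;

```python
def derivative(signal):
    # First difference approximates the slope, emphasising steep QRS edges.
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]

def squared(signal):
    # Squaring makes every sample non-negative and amplifies large slopes.
    return [v * v for v in signal]

def moving_window_integral(signal, width):
    # Average over a trailing window of up to `width` samples.
    out = []
    for i in range(len(signal)):
        start = max(0, i - width + 1)
        window = signal[start:i + 1]
        out.append(sum(window) / len(window))
    return out

def pan_tompkins_envelope(ecg, width):
    # Hypothetical helper chaining the three stages described above.
    return moving_window_integral(squared(derivative(ecg)), width)

# Toy signal: flat baseline with one sharp spike standing in for an R-wave.
ecg = [0.0] * 10 + [0.2, 1.0, 0.1] + [0.0] * 10
envelope = pan_tompkins_envelope(ecg, 3)
peak_index = envelope.index(max(envelope))  # index 11, the sample of the spike
```

In practice the envelope would be compared against an adaptive threshold to locate each R-peak; that step is not shown here.&lt;br /&gt;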
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
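&lt;br /&gt;
As a rough illustration of the maximum-amplitude frequency feature, the following Python sketch finds the largest non-DC component of a naive discrete Fourier transform. The 50 Hz sampling rate and the two synthetic test signals (sums of sinusoids, not ECG recordings) are assumptions made purely for the example.&lt;br /&gt;

```python
import cmath
import math

def dominant_frequency(signal, sample_rate):
    # Frequency (Hz) of the largest non-DC bin of a naive DFT.
    n = len(signal)

    def magnitude(k):
        return abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                       for t in range(n)))

    best_bin = max(range(1, n // 2 + 1), key=magnitude)
    return best_bin * sample_rate / n

sample_rate = 50  # Hz, hypothetical
times = [t / sample_rate for t in range(100)]
# Synthetic stand-ins: one dominated by 1 Hz, the other by 6 Hz.
normal_like = [math.sin(2 * math.pi * 1.0 * t) + 0.3 * math.sin(2 * math.pi * 6.0 * t)
               for t in times]
af_like = [0.3 * math.sin(2 * math.pi * 1.0 * t) + math.sin(2 * math.pi * 6.0 * t)
           for t in times]
```

A fast Fourier transform would normally be used instead of this direct summation, which is only practical for short signals.&lt;br /&gt;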
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Comparison of the Frequency Spectra of a Normal ECG and an AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class further separated into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML technique, this may form clusters for each class, or assign weights in a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
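&lt;br /&gt;
The train-then-validate procedure can be sketched in Python as follows. The classifier here is a deliberately simple nearest-class-mean rule on a single feature, standing in for the SVM, CNN and LSTM models discussed below, and the feature values are invented for illustration.&lt;br /&gt;

```python
import random

def split_data(samples, labels, test_fraction, seed):
    # Shuffle indices reproducibly, then hold out the last part as the test set.
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    n_test = int(len(order) * test_fraction)
    n_train = len(order) - n_test
    train = [(samples[i], labels[i]) for i in order[:n_train]]
    test = [(samples[i], labels[i]) for i in order[n_train:]]
    return train, test

def fit_class_means(train):
    # Training: store the mean feature value of each class.
    grouped = {}
    for value, label in train:
        grouped.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}

def predict(means, value):
    # Assign the class whose mean is closest to the new value.
    return min(means, key=lambda label: abs(means[label] - value))

def accuracy(means, test):
    # Validation: fraction of test samples given their true label.
    correct = sum(1 for value, label in test if predict(means, value) == label)
    return correct / len(test)

# Invented feature values (e.g. a measure of RR-interval variability).
samples = [0.02, 0.03, 0.04, 0.05, 0.03, 0.20, 0.22, 0.25, 0.28, 0.30]
labels = ['normal'] * 5 + ['AF'] * 5
train, test = split_data(samples, labels, 0.3, 0)
model = fit_class_means(train)
test_accuracy = accuracy(model, test)
```

Because the two invented classes are well separated, every held-out sample is classified correctly here; real ECG features overlap far more.&lt;br /&gt;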
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the side of the boundary on which it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational power required for classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported 99% accuracy when using a 2D-CNN, but only 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as a sequence is processed, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between five classes of ECG with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, using a bidirectional LSTM network and wavelet sequences to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ECG classification.&amp;#039;&amp;#039;]]&lt;br /&gt;
Figure X shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The loop from signal filtering to classification assessment reflects that we investigated various machine learning techniques as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
There is an example on MathWorks which demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier using MATLAB&amp;lt;ref&amp;gt;Mathworks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine on how to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We used this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigated the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity were applied to the same dataset, then the raw dataset and its cleaned versions were fed into the classifiers to measure the importance of the pre-processing stage.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm, causing the heart to beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we performed feature engineering to extract information about heart rate variability (HRV), and used an SVM to recognise the pattern of AF signals.&lt;br /&gt;
[[File:SVM HRV AF.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of models using raw/wavelet/MoV denoising.&amp;#039;&amp;#039;]]&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we experimented with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
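&lt;br /&gt;
To give a sense of what the time-domain HRV features in the table measure, the Python sketch below computes SDNN, RMSSD and NNx from a list of RR intervals in milliseconds. It follows the standard definitions of these measures; it is not the ExtractFeatures.m implementation, and the interval values are invented for illustration.&lt;br /&gt;

```python
import math
from operator import gt

def sdnn(rr_ms):
    # Sample standard deviation of the RR (NN) intervals, in milliseconds.
    mean = sum(rr_ms) / len(rr_ms)
    return math.sqrt(sum((r - mean) ** 2 for r in rr_ms) / (len(rr_ms) - 1))

def rmssd(rr_ms):
    # Root mean square of the differences between adjacent intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def nnx(rr_ms, x_ms):
    # Number of adjacent-interval differences exceeding x milliseconds
    # (x_ms = 50 gives the commonly used NN50).
    diffs = [abs(b - a) for a, b in zip(rr_ms, rr_ms[1:])]
    return sum(1 for d in diffs if gt(d, x_ms))

# Invented RR intervals: a steady rhythm and an irregular one.
regular = [800, 810, 790, 805, 795]
irregular = [600, 950, 700, 1100, 650]
```

All three measures are larger for the irregular rhythm, which is why features of this kind are useful for separating AF from normal recordings.&lt;br /&gt;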
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m, provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt;, to extract 169 features.&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, this code first attempts to classify the data without extracting any features, which is used as a comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and it then prepares the output for the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 feature matrix. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
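&lt;br /&gt;
The idea behind spectral entropy can be shown with a short Python sketch. This is a whole-signal simplification using a naive discrete Fourier transform, whereas the MathWorks example computes the feature over successive time windows; the function name and test signals are assumptions for illustration.&lt;br /&gt;

```python
import cmath
import math

def spectral_entropy(signal):
    # Normalised Shannon entropy of the one-sided power spectrum:
    # close to 0 for a single sharp spectral peak, 1 for a flat spectrum.
    n = len(signal)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    probs = [p / total for p in power]
    # Bins with exactly zero power contribute nothing and are skipped.
    entropy = -sum(p * math.log2(p) for p in probs if p)
    return entropy / math.log2(len(probs))

# A pure tone has a spiky spectrum; a single impulse has a flat one.
tone = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
impulse = [1.0] + [0.0] * 31
```

Here the tone gives an entropy close to 0 and the impulse gives an entropy of 1, the two extremes of the measure.&lt;br /&gt;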
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, which have already been trained on a large number of general images, as a starting point for learning from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it is able to recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes the results was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set may contain 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the precision is 100% and the recall is 50%, so the F1-score for identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy, but gives far more meaning to the results.&lt;br /&gt;
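&lt;br /&gt;
The arithmetic of this example can be checked with a few lines of Python. The counts are taken directly from the example in the text; the function and variable names are illustrative.&lt;br /&gt;

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    # Standard definitions; F1 is the harmonic mean of precision and recall.
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# 98 normal and 2 abnormal cases; one abnormal case is detected, one is missed.
true_pos, false_pos, false_neg, true_neg = 1, 0, 1, 98
accuracy = (true_pos + true_neg) / (true_pos + false_pos + false_neg + true_neg)
precision, recall, f1 = precision_recall_f1(true_pos, false_pos, false_neg)
print(round(accuracy, 3), round(precision, 3), round(recall, 3), round(f1, 3))
# prints: 0.99 1.0 0.5 0.667
```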
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
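&lt;br /&gt;
A confusion matrix, and the per-class F1-score derived from it, can be computed as in the following Python sketch. The labels and predictions are invented for illustration and are not results from this project.&lt;br /&gt;

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    # Rows are true classes and columns are predicted classes.
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix

def f1_for_class(matrix, k):
    # F1-score of class k; assumes the class appears at least once
    # in the true or predicted labels (otherwise the denominator is zero).
    true_pos = matrix[k][k]
    false_neg = sum(matrix[k]) - true_pos
    false_pos = sum(row[k] for row in matrix) - true_pos
    return 2 * true_pos / (2 * true_pos + false_pos + false_neg)

classes = ['normal', 'AF', 'other']
true_labels = ['normal', 'normal', 'normal', 'AF', 'AF', 'other']
predicted = ['normal', 'normal', 'AF', 'AF', 'normal', 'other']
matrix = confusion_matrix(true_labels, predicted, classes)
af_f1 = f1_for_class(matrix, classes.index('AF'))
```

Off-diagonal entries show which classes are being confused with each other; in this toy case one normal recording is predicted as AF and one AF recording as normal, giving an AF F1-score of 0.5.&lt;br /&gt;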
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that, in general, the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest F1-score (0.848), the LSTM has the advantage of being quicker and less computationally intensive to use, whilst its best result (0.817) still exceeds that of the SVM classifier (0.7935). In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16851</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16851"/>
		<updated>2021-10-21T18:23:37Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment, or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal of interest is the electrocardiogram (ECG). These signals are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so the early detection and treatment of these diseases is critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than manual analysis&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf &amp;lt;/ref&amp;gt;. ECGs can therefore be acquired by placing electrodes on the body (either on the torso or the limbs) and measuring the potential difference between them. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P wave, T wave and QRS complex, as well as the time between subsequent R peaks, are of interest, since any irregularity or absence in any of these features could indicate an abnormality. The P wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T wave represents the repolarisation of the ventricles; the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may present dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG. For these reasons, pre-processing combined with machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focussed on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be done with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies at 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
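&lt;br /&gt;
As a rough illustration of band-pass filtering with the 0.5 Hz and 30 Hz cut-offs mentioned above, the following Python sketch chains two simple one-pole filters. It is not the filter design used by Wang et al. or Hu et al.; the sampling rate of 300 Hz, the function names and the synthetic input are all assumptions made for the example.&lt;br /&gt;

```python
import math

def low_pass(x, cutoff_hz, fs_hz):
    # One-pole low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    y = [x[0]]
    for sample in x[1:]:
        y.append(y[-1] + alpha * (sample - y[-1]))
    return y

def high_pass(x, cutoff_hz, fs_hz):
    # Subtracting the slow-moving part removes offsets and baseline wander.
    slow = low_pass(x, cutoff_hz, fs_hz)
    return [a - b for a, b in zip(x, slow)]

def band_pass(x, low_cut_hz, high_cut_hz, fs_hz):
    # High-pass first (baseline removal), then low-pass (high-frequency noise).
    return low_pass(high_pass(x, low_cut_hz, fs_hz), high_cut_hz, fs_hz)

FS = 300.0  # assumed sampling rate in Hz (not stated in the text above)
t = [n / FS for n in range(3000)]
# Synthetic input: a 5 Hz component standing in for ECG content, plus a constant offset.
signal = [1.5 + math.sin(2.0 * math.pi * 5.0 * ti) for ti in t]
filtered = band_pass(signal, 0.5, 30.0, FS)
```

In this sketch the constant offset is removed while the 5 Hz component passes almost unchanged. A practical implementation would normally use a higher-order filter with a sharper roll-off.&lt;br /&gt;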
&lt;br /&gt;
Wavelet denoising works in quite a different manner. Rather than attenuating fixed frequency bands, wavelet decomposition is applied to the signal, and a threshold is used to concentrate the signal over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
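&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched in a few lines of Python. This is a simplified illustration only: the initial band-pass filter and the adaptive thresholds of the full Pan-Tompkins algorithm are omitted, a first difference stands in for the derivative filter, and the synthetic spike and the window width of 6 samples are assumptions.&lt;br /&gt;

```python
def derivative(x):
    # First difference: emphasises the steep slopes of the QRS complex.
    return [b - a for a, b in zip(x, x[1:])]

def squared(x):
    # Squaring makes every sample non-negative and amplifies large slopes.
    return [v * v for v in x]

def moving_window_integral(x, width):
    # Average over the most recent `width` samples (fewer at the very start).
    out = []
    for i in range(len(x)):
        window = x[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out

# Synthetic trace: flat baseline with one narrow spike standing in for an R-wave.
ecg = [0.0] * 100
ecg[48:53] = [0.2, 0.6, 1.0, 0.6, 0.2]

energy = moving_window_integral(squared(derivative(ecg)), 6)
# Location of the largest integrated slope energy, just after the spike.
peak_index = max(range(len(energy)), key=energy.__getitem__)
```

The integrated output peaks slightly after the spike itself, because the window has to pass over both the rising and the falling slope before it contains the full slope energy.&lt;br /&gt;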
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
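&lt;br /&gt;
To illustrate how wavelet decomposition reduces a signal to a shorter set of coefficients, the following Python sketch applies a multi-level Haar transform. The Haar wavelet is chosen here only because it is the simplest to write out; the wavelet family, the number of levels and the synthetic input are assumptions and are not taken from the cited work.&lt;br /&gt;

```python
import math

def haar_step(x):
    # One level of the Haar transform: scaled pairwise sums and differences.
    s = 1.0 / math.sqrt(2.0)
    approx = [(a + b) * s for a, b in zip(x[0::2], x[1::2])]
    detail = [(a - b) * s for a, b in zip(x[0::2], x[1::2])]
    return approx, detail

def haar_decompose(x, levels):
    # Returns [approx_L, detail_L, ..., detail_1], coarsest coefficients first.
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

# Synthetic input: 256 samples of a slow sinusoid (length must be divisible by 2**levels).
signal = [math.sin(2.0 * math.pi * n / 64.0) for n in range(256)]
coeffs = haar_decompose(signal, 4)
features = coeffs[0]  # 16 coarse coefficients summarising the 256 samples
```

Each level halves the number of approximation coefficients, so four levels reduce 256 samples to 16 coarse coefficients while the detail coefficients retain the information needed to reconstruct the signal.&lt;br /&gt;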
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, including classes such as normal and abnormal, and possibly even separating the abnormal class into specific conditions. Classification can be completed using many different methods. In this project, the classification step has made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
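&lt;br /&gt;
The split-and-validate procedure described above can be sketched as follows. The record identifiers, the labels, the 30% test fraction and the example predictions are all hypothetical values chosen for illustration; no classifier is trained here.&lt;br /&gt;

```python
import random

def train_test_split(records, labels, test_fraction, seed):
    # Shuffle the indices reproducibly, then cut them into test and training parts.
    order = list(range(len(records)))
    random.Random(seed).shuffle(order)
    n_test = int(round(test_fraction * len(order)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    train = [(records[i], labels[i]) for i in train_idx]
    test = [(records[i], labels[i]) for i in test_idx]
    return train, test

def accuracy(predicted, actual):
    # Fraction of test items whose assigned class matches the actual class.
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)

records = list(range(10))      # hypothetical record identifiers
labels = ["N", "A"] * 5        # hypothetical labels: N = normal, A = AF
train, test = train_test_split(records, labels, 0.3, seed=0)

# Hypothetical predictions compared against hypothetical true classes.
score = accuracy(["N", "A", "A", "N"], ["N", "A", "N", "N"])
```

Keeping the test records out of training is what makes the resulting accuracy a fair estimate of performance on unseen data.&lt;br /&gt;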
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the side of the boundary on which it falls. Figure 2.10 shows a simple 2-dimensional example with Class 1 in red and Class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as Class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer is used to classify the data, and this is usually a regular ANN. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to classify between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, using a bidirectional LSTM network and wavelet sequences to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ECG classification.&amp;#039;&amp;#039;]]&lt;br /&gt;
Figure X shows the flow chart for distinguishing AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The chart loops from signal filtering back to classification assessment, since we investigate various machine learning techniques as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
There is an example from MathWorks which demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier in MATLAB&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into classifiers to measure the importance of the pre-processing step.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm which causes the heart to beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we will perform feature engineering that extracts information about heart rate variability (HRV), and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
[[File:SVM HRV AF.png|thumb|&amp;#039;&amp;#039;Figure X: ROC and AUC of AF class of models using raw/wavelet/MoV denoising.&amp;#039;&amp;#039;]]&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we will experiment with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, using a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features.&lt;br /&gt;
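&lt;br /&gt;
Two of the time-domain features listed in the table above, SDNN and RMSSD, can be computed directly from a list of RR intervals. The Python sketch below is illustrative only: it is neither the project algorithm nor ExtractFeatures.m, the function names are hypothetical, and the RR values are made up.&lt;br /&gt;

```python
import math

def sdnn(rr):
    # Sample standard deviation of the RR (NN) intervals.
    mean = sum(rr) / len(rr)
    return math.sqrt(sum((v - mean) ** 2 for v in rr) / (len(rr) - 1))

def rmssd(rr):
    # Root mean square of the differences between successive intervals.
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Made-up RR intervals in seconds: a steady rhythm and an irregular one.
regular_rr = [0.80, 0.81, 0.79, 0.80, 0.80, 0.81]
irregular_rr = [0.62, 0.95, 0.71, 1.10, 0.58, 0.88]

regular_features = (sdnn(regular_rr), rmssd(regular_rr))
irregular_features = (sdnn(irregular_rr), rmssd(irregular_rr))
```

Both measures are larger for the irregular sequence, which is the kind of separation between classes that makes them candidate features for AF detection.&lt;br /&gt;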
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
The code first attempts to classify the data without extracting any features, which is used as a baseline for comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and prepares the output for the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits around 40%, and that training takes a substantial amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 array. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
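&lt;br /&gt;
To make the idea of spectral entropy more concrete, the following Python sketch computes a normalised spectral entropy for a whole signal using a direct (slow) discrete Fourier transform. It is only an illustration under simplifying assumptions: the function names and test signals are hypothetical, and the MathWorks example computes the entropy over short time windows rather than once per signal.&lt;br /&gt;

```python
import cmath
import math

def power_spectrum(signal):
    """One-sided power spectrum from a direct (slow) discrete Fourier transform."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        powers.append(abs(coeff) ** 2)
    return powers

def spectral_entropy(signal):
    """Normalised spectral entropy: near 0 for a spiky spectrum, 1 for a flat one."""
    powers = power_spectrum(signal)
    total = sum(powers)
    probs = [p / total for p in powers]
    # Shannon entropy of the normalised spectrum; negligible bins are skipped
    entropy = -sum(p * math.log2(p) for p in probs if p > 1e-12)
    return entropy / math.log2(len(probs))

n = 64
sine = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]  # single frequency
impulse = [1.0] + [0.0] * (n - 1)                              # flat spectrum
print(abs(round(spectral_entropy(sine), 3)))
print(round(spectral_entropy(impulse), 3))
```

The single-frequency signal has all of its power in one bin and so gives an entropy close to 0, whereas the impulse spreads its power evenly over all bins and gives an entropy of 1.&lt;br /&gt;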
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning takes a pre-trained model, already trained on a large number of general images, and adapts it to learn from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it can recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the precision for the abnormal class is 100% and the recall is 50%, so the F1-score would be 66.7%, which is significantly lower than the accuracy but gives far more meaning to the results.&lt;br /&gt;
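&lt;br /&gt;
The arithmetic in the example above can be checked with a few lines of Python. This is a minimal sketch rather than the code used to produce the results; the function name is hypothetical, and the abnormal class is treated as the positive class.&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1-score for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts from the example above, with "abnormal" as the positive class:
# 98 normal cases correctly labelled, 1 abnormal case found, 1 abnormal case missed
tp, fp, fn, tn = 1, 0, 1, 98
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

For these counts the accuracy is 0.990 while the F1-score is 0.667, matching the figures quoted in the text.&lt;br /&gt;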
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
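&lt;br /&gt;
A confusion matrix of this kind can be built by counting pairs of true and predicted labels. The Python sketch below shows the idea for the three classes used here; the labels are made up for illustration and are not project results.&lt;br /&gt;

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

# Hypothetical labels for ten recordings (N = normal, A = AF, O = other)
classes = ["N", "A", "O"]
y_true = ["N", "N", "N", "N", "A", "A", "A", "O", "O", "O"]
y_pred = ["N", "N", "N", "O", "A", "A", "N", "O", "A", "O"]

matrix = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, matrix):
    print(name, row)

# Correct predictions lie on the diagonal
correct = sum(matrix[i][i] for i in range(len(classes)))
print("accuracy:", correct / len(y_true))
```

Off-diagonal entries show which classes are being confused with each other, for example an AF recording predicted as normal.&lt;br /&gt;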
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest scores, the LSTM has the advantage of being quicker and less computationally intensive to use, whilst still outperforming the SVM classifier when wavelet denoising was applied. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:SVM_HRV_AF.png&amp;diff=16850</id>
		<title>File:SVM HRV AF.png</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:SVM_HRV_AF.png&amp;diff=16850"/>
		<updated>2021-10-21T18:15:08Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;af scores on svm hrv&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:SVM_TS_AF.png&amp;diff=16849</id>
		<title>File:SVM TS AF.png</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:SVM_TS_AF.png&amp;diff=16849"/>
		<updated>2021-10-21T18:14:29Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;af scores on svm multifeat&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:SqueezeNet.png&amp;diff=16848</id>
		<title>File:SqueezeNet.png</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:SqueezeNet.png&amp;diff=16848"/>
		<updated>2021-10-21T18:13:47Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;cnn on datatypes&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:FinalPerformance.png&amp;diff=16847</id>
		<title>File:FinalPerformance.png</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:FinalPerformance.png&amp;diff=16847"/>
		<updated>2021-10-21T18:13:08Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;AF scores&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16846</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16846"/>
		<updated>2021-10-21T11:38:45Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One biological signal of interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so its early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than manual diagnosis&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review,  in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf  &amp;lt;/ref&amp;gt;. ECGs can therefore be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between them. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between successive R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between successive heartbeats, so it can quickly show whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG. For these reasons, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focussed on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF is characterised by palpitations, shortness of breath and chest pain, and its incidence increases with age.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be done with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies at 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. The signal is first decomposed into wavelet coefficients; because the decomposition concentrates the signal in only a few large coefficients, the remaining small coefficients, which mostly represent noise, can be removed by applying a threshold&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering because particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
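&lt;br /&gt;
The decompose, threshold and reconstruct idea can be sketched in a few lines of Python. The example below is a simplified illustration only: it assumes a single-level Haar wavelet and soft thresholding with a hand-picked threshold, which are not necessarily the wavelet or settings used in this project, and the test signal is hypothetical.&lt;br /&gt;

```python
import math

def haar_forward(signal):
    """One level of the Haar wavelet transform (signal length must be even)."""
    s = math.sqrt(2)
    approx = [(a + b) / s for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) / s for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    s = math.sqrt(2)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient towards zero; small ones become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def haar_denoise(signal, threshold):
    """Decompose, threshold the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))

# Hypothetical step-like signal with small alternating noise added
clean = [0.0] * 8 + [1.0] * 8
noisy = [x + (0.05 if i % 2 == 0 else -0.05) for i, x in enumerate(clean)]
denoised = haar_denoise(noisy, threshold=0.1)
print([round(x, 3) for x in denoised])
```

In this toy case the noise appears only in the small detail coefficients, so thresholding them recovers the clean step while leaving the larger approximation coefficients untouched.&lt;br /&gt;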
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
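&lt;br /&gt;
The later stages of the Pan-Tompkins algorithm described above can be sketched as follows. This Python example is a simplified illustration rather than the implementation used in this project: the band-pass filtering stage is omitted, the window width and fixed threshold are arbitrary assumptions, and the input is a hypothetical signal.&lt;br /&gt;

```python
def derivative(signal):
    """First difference, which emphasises the steep slopes of the QRS complex."""
    return [b - a for a, b in zip(signal, signal[1:])]

def squared(signal):
    """Point-by-point squaring makes all values positive and boosts large slopes."""
    return [x * x for x in signal]

def moving_window_integral(signal, width):
    """Average over the most recent `width` samples (fewer at the start)."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out

def qrs_envelope(ecg, width):
    # The band-pass filtering stage is omitted; the input is assumed to be filtered
    return moving_window_integral(squared(derivative(ecg)), width)

# Hypothetical filtered ECG: flat baseline with two sharp R-like spikes
ecg = [0.0] * 40
for peak in (10, 30):
    ecg[peak - 1], ecg[peak], ecg[peak + 1] = 0.5, 1.0, 0.5

envelope = qrs_envelope(ecg, width=5)
threshold = 0.5 * max(envelope)
# Samples where the envelope exceeds a simple fixed threshold
active = [i for i, v in enumerate(envelope) if v > threshold]
print(active)
```

The printed indices cluster just after each spike, because the moving-window integrator delays and smooths the squared slope. The original algorithm uses adaptive thresholds rather than the fixed one assumed here.&lt;br /&gt;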
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECGs.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class further separated into specific conditions. Classification can be completed using many different methods. In this project, the classification step has made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML technique, this may form clusters of each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set of data. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
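&lt;br /&gt;
The split-and-validate procedure described above can be sketched as follows. This Python example is illustrative only: the data, the 30% test fraction and the fixed cut-off standing in for a trained classifier are all assumptions, not the settings used in this project.&lt;br /&gt;

```python
import random

def train_test_split(samples, labels, test_fraction, seed):
    """Shuffle the data reproducibly and split it into training and test sets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([samples[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test

def accuracy(predicted, actual):
    """Fraction of test items whose assigned class matches the actual class."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)

# Hypothetical data: one feature per recording, labelled normal (N) or AF (A)
samples = [0.1, 0.2, 0.15, 0.9, 0.8, 0.85, 0.05, 0.95, 0.3, 0.7]
labels = ["N", "N", "N", "A", "A", "A", "N", "A", "N", "A"]

(train_x, train_y), (test_x, test_y) = train_test_split(samples, labels, 0.3, seed=0)
# Stand-in classifier: a fixed cut-off at 0.5 rather than a trained model
predicted = ["A" if x >= 0.5 else "N" for x in test_x]
print(len(train_x), len(test_x), accuracy(predicted, test_y))
```

Keeping the test set separate from the training set means the reported score reflects how the classifier behaves on recordings it has not seen before.&lt;br /&gt;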
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or a hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on where it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
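&lt;br /&gt;
Once a separating line has been found, classifying a new point only requires checking which side of the line it falls on. The Python sketch below illustrates that decision rule for a linear SVM in two dimensions; the weights and bias are hypothetical values chosen by hand, and the training step that would normally produce them is omitted.&lt;br /&gt;

```python
def linear_svm_predict(weights, bias, point):
    """Assign a class from the side of the hyperplane w.x + b = 0 the point is on."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return "class 2" if score >= 0 else "class 1"

# Hypothetical separating line x1 + x2 = 5, as if it had come from training
weights, bias = [1.0, 1.0], -5.0

for point in ([1.0, 2.0], [4.0, 3.0], [3.5, 2.5]):
    print(point, linear_svm_predict(weights, bias, point))
```

The same rule applies in higher dimensions, where the weights describe a hyperplane and there is one weight per feature.&lt;br /&gt;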
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer is used to classify the data, and this is usually a regular ANN. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between five classes of ECG with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, combining a bidirectional LSTM network with wavelet sequences to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
[[File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ECG classification.&amp;#039;&amp;#039;]]&lt;br /&gt;
Figure X shows the workflow used to distinguish AF from normal signals, from data preparation through pre-processing and feature engineering to the assessment of classification performance. The workflow loops from signal filtering back to classification assessment, since we will investigate various machine learning techniques as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;Mathworks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the extracted features. The data was split into two sets: a training set and a test set. The training set was used to train the classifier, and the test set was used to measure its accuracy. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into the classifiers to measure the importance of the pre-processing stage.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm, which makes the heart beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we will perform feature engineering that extracts information about heart rate variability (HRV), and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we will experiment with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features.&lt;br /&gt;
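&lt;br /&gt;
As an illustration of the time-domain HRV features in Table X, the following Python sketch computes SDNN, RMSSD and NNx from a list of RR intervals. It is a minimal sketch and not the ExtractFeatures.m implementation: the function name, the use of the sample standard deviation for SDNN, the 50 ms threshold for NNx, and the RR values are all assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import statistics


def hrv_time_domain(rr_ms, x_ms=50.0):
    """Return SDNN, RMSSD and NNx for a list of RR intervals in ms."""
    # Differences between successive RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        # SDNN: standard deviation of the intervals (sample form assumed).
        "sdnn": statistics.stdev(rr_ms),
        # RMSSD: root mean square of the successive differences.
        "rmssd": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        # NNx: count of successive differences larger than x ms.
        "nnx": sum(1 for d in diffs if abs(d) > x_ms),
    }


# Hypothetical RR intervals (ms), not taken from the project dataset.
regular_rr = [800, 810, 790, 805, 795, 800]
irregular_rr = [620, 940, 710, 1050, 580, 860]

regular = hrv_time_domain(regular_rr)
irregular = hrv_time_domain(irregular_rr)
print(regular)
print(irregular)
```
&lt;br /&gt;
An irregular rhythm such as AF would be expected to give larger values for all three features than a regular rhythm, which is what makes them candidate inputs to a classifier such as the SVM.&lt;br /&gt;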
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, this code first attempts to classify the data without extracting any features, which will be used as a comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and it then prepares the output for the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 matrix. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
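&lt;br /&gt;
To make the idea of spectral entropy concrete, the Python sketch below computes a normalised spectral entropy for a whole signal. This is a simplified, assumed formulation (a naive DFT over the full signal, normalised by the number of frequency bins) and not the time-dependent feature extracted by the MathWorks example; the function name and test signals are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import cmath
import math


def spectral_entropy(signal):
    """Normalised spectral entropy (0 = spiky, 1 = flat) of a real signal."""
    n = len(signal)
    half = n // 2 + 1  # one-sided spectrum of a real-valued signal
    power = []
    for k in range(half):
        # Naive DFT coefficient for frequency bin k.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0:
        return 0.0
    # Treat the normalised power spectrum as a probability distribution.
    probs = [p / total for p in power]
    probs = [p for p in probs if p > 1e-12]
    entropy = sum(p * math.log2(1 / p) for p in probs)
    return entropy / math.log2(half)


n = 64
# A pure sinusoid concentrates its power in one bin (spiky spectrum).
sine = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
# A single impulse spreads its power evenly over all bins (flat spectrum).
impulse = [1.0] + [0.0] * (n - 1)

sine_entropy = spectral_entropy(sine)
impulse_entropy = spectral_entropy(impulse)
print(f"sine={sine_entropy:.3f} impulse={impulse_entropy:.3f}")
```
&lt;br /&gt;
The sinusoid gives an entropy near zero and the impulse an entropy near one, matching the description of spiky and flat spectra above.&lt;br /&gt;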
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, which have already been trained on a large number of general images, as a starting point for learning from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it can recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model, and so conveys the balance between the two. In this example, the precision for the abnormal class is 100% and the recall is 50%, so the F1-score of identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy but gives far more meaning to the results.&lt;br /&gt;
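&lt;br /&gt;
The figures in this example can be reproduced with a few lines of Python. The sketch below is illustrative only: the function and variable names are assumptions, and the abnormal class is treated as the positive class.&lt;br /&gt;
&lt;br /&gt;
```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Scenario from the text: 98 normal and 2 abnormal recordings; every normal
# recording and one abnormal recording are predicted as normal.
tp, fn = 1, 1    # abnormal cases found / missed
fp, tn = 0, 98   # normal cases flagged as abnormal / correctly passed

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} F1={f1:.3f}")
```
&lt;br /&gt;
Although only one recording is misclassified, the missed abnormal case halves the recall, which pulls the F1-score well below the accuracy.&lt;br /&gt;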
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
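&lt;br /&gt;
A confusion matrix of this kind can be built by counting each pair of true and predicted labels. The Python sketch below shows the idea for the three classes used in this project; the labels are hypothetical values invented for illustration and are not project results.&lt;br /&gt;
&lt;br /&gt;
```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions; rows are true classes, columns are predicted."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


classes = ["Normal", "AF", "Other"]
# Hypothetical labels, for illustration only (not project data).
y_true = ["Normal", "Normal", "AF", "AF", "Other", "Other", "Normal", "AF"]
y_pred = ["Normal", "Other", "AF", "Normal", "Other", "Other", "Normal", "AF"]

cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(f"{name:6s} {row}")

# Recall of the AF class: correct AF predictions over all true AF cases.
af_recall = cm[1][1] / sum(cm[1])
```
&lt;br /&gt;
Entries on the diagonal are correct predictions, and the off-diagonal entries show which classes are being confused with one another.&lt;br /&gt;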
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest results, the LSTM holds an advantage of being quicker and less computationally intensive to use, whilst still being notably more effective than the SVM classifier. In all cases the wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16845</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16845"/>
		<updated>2021-10-21T11:37:02Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal which is of interest is the electrocardiogram (ECG). These signals are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been a recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf &amp;lt;/ref&amp;gt;. ECGs can therefore be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between them. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between subsequent R peaks, are of interest since any irregularity in, or absence of, any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may present with dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG. For these reasons, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focussed on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and its symptoms include palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. Wavelet decomposition is applied to the signal, and a threshold is used to concentrate the signal over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
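&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched in Python as follows. This is a simplified illustration and not the full Pan-Tompkins algorithm: the band-pass filter is omitted, a two-point difference stands in for the derivative filter, and the window length, threshold and synthetic signal are assumptions chosen for the example.&lt;br /&gt;
&lt;br /&gt;
```python
def pan_tompkins_envelope(ecg, window=5):
    """Differentiate, square, then moving-window integrate a sampled ECG."""
    # Two-point difference stands in for the derivative filter (assumption).
    derivative = [b - a for a, b in zip(ecg, ecg[1:])]
    # Squaring makes every sample positive and emphasises steep slopes.
    squared = [d * d for d in derivative]
    # Moving-window integration via a running sum over the last samples.
    integrated = []
    running = 0.0
    for i, value in enumerate(squared):
        running += value
        if i >= window:
            running -= squared[i - window]
        integrated.append(running / window)
    return integrated


# Synthetic signal: flat baseline with two sharp, R-peak-like spikes.
ecg = [0.0] * 80
for r in (20, 60):
    ecg[r - 1], ecg[r], ecg[r + 1] = 0.3, 1.0, 0.3

envelope = pan_tompkins_envelope(ecg)
# Hypothetical fixed threshold; the real algorithm adapts its thresholds.
threshold = 0.6 * max(envelope)
above = [i for i, v in enumerate(envelope) if v > threshold]
print(above)
```
&lt;br /&gt;
The samples above the threshold cluster around the two spikes, slightly delayed by the integration window, and the spacing between the clusters gives an estimate of the RR interval.&lt;br /&gt;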
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum of comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class separated further into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this knowledge to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
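&lt;br /&gt;
The split and validation steps can be sketched in a few lines of Python. This is only an illustration: the 80/20 split, the seed and the function names split_indices and accuracy are assumptions, not the settings used in this project.&lt;br /&gt;

```python
import random

def split_indices(n_items, test_fraction, seed):
    # Shuffle the indices 0..n_items-1 reproducibly, then cut off a test portion.
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_items * test_fraction))
    return indices[n_test:], indices[:n_test]

def accuracy(predicted, actual):
    # Fraction of test items whose assigned class matches the actual class.
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)

train_idx, test_idx = split_indices(10, 0.2, seed=0)
print(len(train_idx), len(test_idx))          # 8 2
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))   # 0.75
```

Keeping the test indices separate from the training indices is what makes the validation meaningful, since the classifier is then assessed on signals it has not seen.&lt;br /&gt;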
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the region in which it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
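&lt;br /&gt;
Once the separating line is known, classifying a new point only requires checking which side of the line it lies on. The Python sketch below shows this decision rule for a linear boundary. The weights and bias are hand-picked, hypothetical values rather than the output of SVM training, and the function names are illustrative only.&lt;br /&gt;

```python
def decision_value(weights, bias, features):
    # Signed score: positive on one side of the hyperplane, negative on the other.
    return sum(w * x for w, x in zip(weights, features)) + bias

def classify(weights, bias, features):
    # Class 2 on the positive side of the boundary, class 1 otherwise.
    return 2 if decision_value(weights, bias, features) > 0 else 1

# Hypothetical boundary x1 + x2 = 5 in a 2-dimensional feature space.
weights, bias = (1.0, 1.0), -5.0
print(classify(weights, bias, (4.0, 3.0)))   # 2
print(classify(weights, bias, (1.0, 2.0)))   # 1
```

Training an SVM amounts to choosing the weights and bias so that the boundary separates the training classes with the largest possible margin; that optimisation step is not shown here.&lt;br /&gt;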
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example the hand-written numbers in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to classify between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
[[File:File:Methodology.drawio.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: ECG classification.&amp;#039;&amp;#039;]]&lt;br /&gt;
Figure X shows the workflow used to distinguish AF from normal signals, starting with data preparation and moving through pre-processing and feature engineering to the assessment of classification performance. The workflow loops from signal filtering back to classification assessment, since we will investigate various machine learning techniques as well as the most appropriate denoising method for AF detection.&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity (MoV) will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into the classifiers to measure the importance of the pre-processing stage. &lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm that causes the heart to beat chaotically and irregularly compared with a normal rhythm. It is therefore possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we will perform feature engineering that extracts information about heart rate variability (HRV), and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we will experiment with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m, provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt;, to extract the 169 features.&lt;br /&gt;
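&lt;br /&gt;
Two of the time-domain HRV features in the table, SDNN and RMSSD, can be computed directly from a list of beat-to-beat (RR) intervals. The Python sketch below is illustrative only and is not the ExtractFeatures.m implementation: the interval values are made up, and the use of the sample (n-1) standard deviation for SDNN is an assumption.&lt;br /&gt;

```python
import math

def sdnn(rr_ms):
    # Sample standard deviation of the beat-to-beat (RR) intervals, in ms.
    mean_rr = sum(rr_ms) / len(rr_ms)
    variance = sum((rr - mean_rr) ** 2 for rr in rr_ms) / (len(rr_ms) - 1)
    return math.sqrt(variance)

def rmssd(rr_ms):
    # Root mean square of the differences between successive RR intervals, in ms.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

regular = [800, 810, 795, 805, 800]      # hypothetical steady rhythm
irregular = [620, 980, 710, 1100, 560]   # hypothetical AF-like rhythm
print(sdnn(regular), rmssd(regular))
print(sdnn(irregular), rmssd(irregular))
```

Both measures are much larger for the irregular sequence, which is the property that makes HRV features useful for separating AF from a normal rhythm.&lt;br /&gt;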
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, this code first attempts to classify the data without extracting any features, which is used as a comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and it then prepares the output for the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits at around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 vector. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
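&lt;br /&gt;
The idea behind spectral entropy can be illustrated with a short Python sketch. It is a simplified, whole-signal version written under stated assumptions: the MATLAB example presumably evaluates the feature over short time windows, whereas this sketch computes one value per signal using a naive discrete Fourier transform. The function names power_spectrum and spectral_entropy are hypothetical.&lt;br /&gt;

```python
import cmath
import math

def power_spectrum(signal):
    # Naive one-sided discrete Fourier transform power, bins 0..n/2.
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def spectral_entropy(signal):
    # Shannon entropy of the normalised power spectrum, scaled to the range 0..1.
    power = power_spectrum(signal)
    total = sum(power)
    probs = [p / total for p in power]
    entropy = -sum(p * math.log(p) for p in probs if p)   # skip zero-power bins
    return entropy / math.log(len(probs))

n = 64
tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]   # spiky spectrum
impulse = [1.0] + [0.0] * (n - 1)                              # flat spectrum
print(spectral_entropy(tone))      # close to 0
print(spectral_entropy(impulse))   # close to 1
```

A signal whose power is concentrated at one frequency gives a value near 0, while a signal whose power is spread evenly across all frequencies gives a value near 1.&lt;br /&gt;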
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, which have already been trained on a large number of general images, to learn from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it is able to recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model, and so conveys the balance between the two. In this example, the F1-score for identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy, but gives far more meaning to the results.&lt;br /&gt;
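&lt;br /&gt;
The figures in this example can be checked with a short Python sketch. The counts come from the example above, with the abnormal class treated as the positive class; the function name f1_score is illustrative only.&lt;br /&gt;

```python
def f1_score(true_pos, false_pos, false_neg):
    # Harmonic mean of precision and recall.
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# 98 normal and 2 abnormal cases; one abnormal case is missed.
true_pos, false_pos, false_neg, true_neg = 1, 0, 1, 98
accuracy = (true_pos + true_neg) / 100
print(accuracy)                                   # 0.99
print(f1_score(true_pos, false_pos, false_neg))   # about 0.667
```

Here the precision is 1 and the recall is 0.5, so the single missed case halves the recall and pulls the F1-score down to about 0.667 even though the accuracy remains at 0.99.&lt;br /&gt;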
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
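&lt;br /&gt;
As a rough illustration of how a per-class F1-score is read from a confusion matrix, the Python sketch below builds the matrix from made-up labels and computes the score for one class. The label values, the class numbering and the function names are all hypothetical, and the sketch assumes every class appears at least once.&lt;br /&gt;

```python
def confusion_matrix(actual, predicted, n_classes):
    # Rows are true classes, columns are predicted classes.
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def class_f1(matrix, c):
    # F1-score of class c, read from its row (actual) and column (predicted).
    true_pos = matrix[c][c]
    false_neg = sum(matrix[c]) - true_pos
    false_pos = sum(row[c] for row in matrix) - true_pos
    return 2 * true_pos / (2 * true_pos + false_pos + false_neg)

# Hypothetical labels: 0 = normal, 1 = AF, 2 = other abnormality.
actual    = [0, 0, 0, 1, 1, 1, 2, 2]
predicted = [0, 0, 1, 1, 1, 0, 2, 0]
matrix = confusion_matrix(actual, predicted, 3)
print(matrix)               # [[2, 1, 0], [1, 2, 0], [1, 0, 1]]
print(class_f1(matrix, 1))  # about 0.667
```

Entries on the diagonal are correct predictions, so a classifier that performs well concentrates its counts there, and the off-diagonal entries show which classes are being confused with each other.&lt;br /&gt;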
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest results, the LSTM holds an advantage of being quicker and less computationally intensive to use, whilst still being notably more effective than the SVM classifier. In all cases the wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16844</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16844"/>
		<updated>2021-10-21T11:24:39Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment, or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal which is of interest is the electrocardiogram (ECG). These signals are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so the early detection and treatment of these diseases is critical.&lt;br /&gt;
&lt;br /&gt;
There has been a recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review,  in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf  &amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular the P wave, T wave and QRS complex, as well as time between subsequent R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so can quickly identify whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may present dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG. For these reasons, pre-processing combined with machine learning classification may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that covers heart disease, stroke and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD comes in many forms, although this project focuses on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF is characterised by palpitations, shortness of breath and chest pain, and its incidence increases with age.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be done with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings: a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies of 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. The signal is first decomposed into wavelet coefficients, which concentrates the signal in only a few large coefficients; coefficients below a certain threshold, which mostly represent noise, are then suppressed before the signal is reconstructed&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering because particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
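&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched in a few lines of Python. This is a minimal illustration only, not the implementation used in this project: it assumes the ECG has already been band-pass filtered, it replaces the adaptive thresholds of the original algorithm with a fixed threshold, and the function names, the 150 ms window and the 200 ms refractory period are illustrative choices.&lt;br /&gt;

```python
def derivative(x):
    # Causal five-point derivative: emphasises the steep slopes of the QRS complex.
    return [
        (2 * x[n] + x[n - 1] - x[n - 3] - 2 * x[n - 4]) / 8.0 if n >= 4 else 0.0
        for n in range(len(x))
    ]


def moving_window_integral(x, width):
    # Mean of the most recent `width` samples (fewer at the very start).
    out = []
    running = 0.0
    for n, value in enumerate(x):
        running += value
        if n >= width:
            running -= x[n - width]
        out.append(running / min(n + 1, width))
    return out


def detect_beats(ecg, fs):
    # Assumption: `ecg` is already band-pass filtered; `fs` is in Hz.
    squared = [d * d for d in derivative(ecg)]
    window = max(1, round(0.15 * fs))            # roughly 150 ms
    integrated = moving_window_integral(squared, window)
    threshold = 0.5 * max(integrated)            # fixed threshold (simplification)
    refractory = round(0.2 * fs)                 # ignore detections within 200 ms
    beats = []
    for n in range(1, len(integrated) - 1):
        rising = integrated[n] >= integrated[n - 1]
        falling = integrated[n] > integrated[n + 1]
        far_enough = not beats or n - beats[-1] > refractory
        if rising and falling and integrated[n] > threshold and far_enough:
            beats.append(n)
    return beats
```

The returned indices mark maxima of the integrated signal, which lag the true R-peaks by a fixed delay; the differences between successive indices still give the RR intervals in samples.&lt;br /&gt;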
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class separated further into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
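&lt;br /&gt;
The split, train and validate procedure can be illustrated with a short Python sketch. The nearest-centroid classifier used here is only a simple stand-in chosen to keep the example self-contained; it is not one of the classifiers used in this project, and the function names and the 80/20 split are assumptions made for illustration.&lt;br /&gt;

```python
import random


def train_test_split(samples, labels, test_fraction=0.2, seed=0):
    # Shuffle the indices reproducibly, then hold out the last `test_fraction`.
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    n_test = max(1, round(test_fraction * len(order)))
    train, test = order[:-n_test], order[-n_test:]

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(train), pick(test)


def fit_centroids(samples, labels):
    # "Training": store the mean feature vector of each class.
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: [sum(col) / len(col) for col in zip(*xs)] for y, xs in groups.items()}


def predict(centroids, x):
    # Assign the class whose centroid is closest (squared Euclidean distance).
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(centroids, key=lambda y: dist(centroids[y]))


def accuracy(centroids, samples, labels):
    # Validation: fraction of held-out samples whose predicted class is correct.
    hits = sum(predict(centroids, x) == y for x, y in zip(samples, labels))
    return hits / len(labels)
```

The key point is that `accuracy` is only ever called on the held-out test set, so the score reflects performance on data the model has not seen during training.&lt;br /&gt;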
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or a hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on where it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported 99% accuracy when using a 2D-CNN, but only 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to classify between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, combining a bidirectional LSTM network with wavelet sequences to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy ECG waveform. Notice that the rhythm (i.e. the time between R-peaks) is relatively constant, and that all ECG features are clearly noticeable and have the expected locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier in MATLAB&amp;lt;ref&amp;gt;Mathworks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set showed an accuracy of approximately 98%. We use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity are applied to the same dataset, then the raw dataset and its cleaned versions are fed into the classifiers to measure the importance of the pre-processing stage.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm which causes the heart to beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we perform feature engineering that extracts information about heart rate variability (HRV), and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
&lt;br /&gt;
Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt; found that HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we experiment with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
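&lt;br /&gt;
As an illustration of the time-domain row of the table above, the following Python sketch computes SDNN, RMSSD and NNx from a list of RR intervals. It is not the ExtractFeatures.m tool or the algorithm developed in this project; the function name, the 50 ms default for x and the use of the sample standard deviation for SDNN are assumptions made for illustration.&lt;br /&gt;

```python
import math
import statistics


def hrv_time_features(rr_ms, x_ms=50):
    # rr_ms: successive RR (NN) intervals in milliseconds, at least two values.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    nnx = sum(abs(d) > x_ms for d in diffs)
    return {
        # Standard deviation of the intervals (sample convention assumed).
        "SDNN": statistics.stdev(rr_ms),
        # Root mean square of successive differences.
        "RMSSD": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        # Number of successive differences larger than x_ms.
        "NNx": nnx,
        # The same count expressed as a percentage of all differences.
        "pNNx": 100.0 * nnx / len(diffs),
    }


regular = hrv_time_features([800, 810, 790, 805, 795])
irregular = hrv_time_features([600, 950, 700, 1100, 650])
```

For the made-up intervals above, the irregular sequence gives much larger SDNN and RMSSD values and a non-zero NNx, which is the kind of separation between regular and AF-like rhythms that these features are intended to capture.&lt;br /&gt;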
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features.&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
The code first attempts to classify the data without extracting any features, which is used as a comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and its output is then passed to the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time to run (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 array. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, this led either to errors or to extremely poor results, and so is not shown here.&lt;br /&gt;
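&lt;br /&gt;
To give a feel for the spectral entropy feature, the Python sketch below computes a single normalised entropy value for a whole segment using a naive discrete Fourier transform. This is a simplification for illustration: the MATLAB example computes the entropy over time from a time-frequency representation, whereas this hypothetical `spectral_entropy` function returns one number per segment and assumes a non-zero input signal.&lt;br /&gt;

```python
import cmath
import math


def spectral_entropy(x):
    # Power spectrum from a naive DFT over the non-negative frequencies.
    n = len(x)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2)
    # Treat the normalised spectrum as a probability distribution.
    total = sum(power)
    probs = [p / total for p in power if p > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Divide by the maximum possible entropy so the result lies between 0 and 1.
    return entropy / math.log2(len(power))


n = 64
sine = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]   # one spectral spike
impulse = [1.0] + [0.0] * (n - 1)                               # flat spectrum
```

A pure sinusoid concentrates its power in one frequency bin and so scores close to 0, while a single impulse spreads its power evenly across all bins and scores close to 1.&lt;br /&gt;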
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt; showed that transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, which have already been trained on a large number of general images, to learn from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it is able to recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Transfer Learning flow chart.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set may contain 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the precision for the abnormal class is 100% and the recall is 50%, so the F1-score would be 66.7%, which is significantly lower than the accuracy but gives far more meaning to the results.&lt;br /&gt;
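&lt;br /&gt;
The arithmetic behind this example can be checked with a few lines of Python. The function name is illustrative; the counts treat the abnormal class as the positive class, giving one true positive, one false negative, no false positives and 98 true negatives.&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    # Precision and recall for the positive class, and their harmonic mean.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 98 normal and 2 abnormal cases; one abnormal case is classified as normal.
tp, fp, fn, tn = 1, 0, 1, 98
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
# accuracy=0.990 precision=1.000 recall=0.500 F1=0.667
```

The single missed case barely changes the accuracy, but it halves the recall and pulls the F1-score down to about two thirds.&lt;br /&gt;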
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
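A confusion chart is simply a table of counts of each (true class, predicted class) pair. The sketch below shows one way this could be computed; the function name and the six example labels are made up for illustration and are not project results.&lt;br /&gt;

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

classes = ["Normal", "AF", "Other"]
true_labels = ["Normal", "Normal", "AF", "AF", "Other", "Other"]
predicted = ["Normal", "Normal", "AF", "Normal", "Other", "AF"]
matrix = confusion_matrix(true_labels, predicted, classes)
# matrix is [[2, 0, 0], [1, 1, 0], [0, 1, 1]]
```

Correct predictions lie on the diagonal, so the off-diagonal entries show exactly which classes are being confused with each other.&lt;br /&gt;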
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that, in general, the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest F1-score, the LSTM has the advantage of being quicker and less computationally intensive to use, whilst still being notably more effective than the SVM classifier. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16843</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16843"/>
		<updated>2021-10-21T11:22:35Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Signals collected from the human body are often used to measure or verify a patient&amp;#039;s health. One biological signal of particular interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so the early detection and treatment of these diseases is critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf &amp;lt;/ref&amp;gt;. As a result, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between subsequent R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly identify whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG. By addressing these issues, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focussed on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies at 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
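The filters used in the cited studies are not reproduced here. As a rough illustration of the low-pass and high-pass idea only, the sketch below uses a moving average (a crude low-pass filter) and subtracts a long moving average from the signal to remove baseline wander (a crude high-pass filter). The function names, window widths and toy signal are assumptions made for illustration.&lt;br /&gt;

```python
def moving_average(signal, width):
    """Centred moving average; near the edges only the available samples are used."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def remove_baseline(signal, width):
    """Subtract a slowly varying baseline estimate from the signal."""
    baseline = moving_average(signal, width)
    return [s - b for s, b in zip(signal, baseline)]

# Toy signal (made up): a fast alternation of +1 and -1 sitting on an offset of 5.
raw = [5.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(20)]
detrended = remove_baseline(raw, 9)   # the offset is removed, the alternation is kept
smoothed = moving_average(raw, 9)     # the alternation is suppressed, the offset is kept
```

A practical implementation would more likely use properly designed notch and Butterworth filters with the cut-off frequencies quoted above.&lt;br /&gt;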
&lt;br /&gt;
Wavelet denoising works in quite a different manner. Wavelet decomposition is applied to the signal, which concentrates the signal over only a few wavelet coefficients, and a threshold is then used to suppress the remaining small coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
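The decompose, threshold and reconstruct idea can be sketched with a single level of the Haar wavelet, which is simple enough to write by hand. This is an assumption made for illustration: the wavelet family, number of levels and threshold rule used in this project are not stated here, and the function names and toy signal are hypothetical.&lt;br /&gt;

```python
import math

def haar_forward(signal):
    """One level of the Haar transform (the signal length must be even)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the signal from approximation and detail coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient towards zero; small ones become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def denoise(signal, threshold):
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))

# Toy signal (made up): two flat levels with small sample-to-sample noise.
noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
clean = denoise(noisy, threshold=0.3)
# The small detail coefficients are zeroed, so each pair collapses to its mean.
```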
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
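The differentiate, square and integrate stages described above can be sketched in a few lines. This is a simplified illustration rather than the full Pan-Tompkins algorithm: the band-pass filtering stage and the adaptive thresholds are omitted, and the fixed threshold, function names and toy signal are assumptions.&lt;br /&gt;

```python
def derivative(signal):
    """First difference, which emphasises the steep slopes of the QRS complex."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]

def squared(signal):
    """Point-by-point squaring makes all values positive and enhances large slopes."""
    return [v * v for v in signal]

def moving_window_integral(signal, width):
    """Average over the most recent samples (zero-padded at the start)."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1): i + 1]
        out.append(sum(window) / width)
    return out

def detect_beats(ecg, width=3, fraction=0.6):
    """Indices where the integrated signal rises above a fraction of its maximum."""
    feature = moving_window_integral(squared(derivative(ecg)), width)
    threshold = fraction * max(feature)
    beats, above = [], False
    for i, v in enumerate(feature):
        if v > threshold and not above:
            beats.append(i)
        above = v > threshold
    return beats

# Toy signal (made up): a flat baseline with two sharp R-like spikes.
ecg = [0.0] * 30
ecg[10] = 1.0
ecg[20] = 1.0
beats = detect_beats(ecg)                                   # [10, 20]
rr_intervals = [b - a for a, b in zip(beats, beats[1:])]    # [10]
```

The differences between successive detections give the RR-intervals in samples, which can then be used as time-domain features.&lt;br /&gt;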
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
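The frequency component with the maximum amplitude can be found by taking the discrete Fourier transform of the signal and picking the largest non-DC bin. The sketch below uses a naive DFT for clarity; the function name, the 50 Hz sampling rate and the synthetic 1 Hz test signal are assumptions made for illustration, not project data.&lt;br /&gt;

```python
import cmath
import math

def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the largest non-DC component of a naive DFT."""
    n = len(signal)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n

# Synthetic example: 4 seconds of a 1 Hz sinusoid sampled at 50 Hz.
sample_rate = 50
signal = [math.sin(2 * math.pi * 1.0 * t / sample_rate)
          for t in range(4 * sample_rate)]
peak_hz = dominant_frequency(signal, sample_rate)   # 1.0
```

For real recordings a fast Fourier transform would be used instead, since the naive DFT above scales with the square of the signal length.&lt;br /&gt;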
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, including classes such as normal and abnormal, and possibly even separating the abnormal class into specific conditions. Classification can be completed using many different methods. In this project, the classification step has made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may make clusters of each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set of data. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
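A minimal sketch of the split and validation steps is given below. The 80/20 split, the random seed, the function names and the placeholder record names are assumptions made for illustration and do not describe the split used in this project.&lt;br /&gt;

```python
import random

def train_test_split(records, labels, test_fraction=0.2, seed=0):
    """Shuffle the record indices and hold out a fraction for testing."""
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)
    n_test = round(test_fraction * len(indices))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([records[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([records[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test

def accuracy(assigned, actual):
    """Fraction of test records whose assigned class matches the actual class."""
    return sum(a == b for a, b in zip(assigned, actual)) / len(actual)

# Hypothetical record names and labels, for illustration only.
records = [f"record_{i}" for i in range(10)]
labels = ["Normal", "AF"] * 5
(train_x, train_y), (test_x, test_y) = train_test_split(records, labels)
```

Keeping the test set completely separate from training is what makes the validation result a fair estimate of performance on new data.&lt;br /&gt;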
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the value of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set of data are then plotted in the same n-dimensional space, and each is assigned a class based on the location in which it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify this as class 2, given the side of the line it falls on.&lt;br /&gt;
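The 2-dimensional case described above can be sketched with a small linear SVM trained by sub-gradient descent on the regularised hinge loss. This is one simple way of fitting the separating line, assumed here for illustration; the learning rate, regularisation strength, function names and toy points are all hypothetical and are not taken from Figure 2.10.&lt;br /&gt;

```python
def train_linear_svm(points, labels, epochs=200, lr=0.1, reg=0.01):
    """Labels are -1 or +1. Returns the weights and bias of the separating line."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularisation shrinks the weights, which favours a wide margin.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if 1.0 > margin:
                # The point is inside the margin or misclassified: move the line.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def classify(x, w, b):
    """Return class 2 on the positive side of the line, class 1 otherwise."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 2 if score > 0 else 1

# Toy training data (made up): class 1 is labelled -1, class 2 is labelled +1.
points = [[-1.0, -1.0], [-1.0, -2.0], [-2.0, -1.0],
          [1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(points, labels)
new_point_class = classify([1.5, 0.5], w, b)   # falls on the class 2 side
```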
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Artificial neural networks (ANNs) are capable of extracting complex and non-linear sets of features from a set of data. They are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which reduces the computational power required for classification. Finally, a fully-connected layer is used to classify the data, and this is usually a regular ANN. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
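The convolution and pooling stages can be illustrated on a 1-dimensional signal. In the sketch below the kernel is picked by hand to respond to rising edges, whereas in a real CNN the kernel values are learned during training; the function names and the toy signal are assumptions made for illustration.&lt;br /&gt;

```python
def convolve1d(signal, kernel):
    """Slide the kernel along the signal (no padding), as in a convolution layer."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    """Keep positive responses and set negative ones to zero."""
    return [max(0.0, v) for v in values]

def max_pool(values, size):
    """Keep only the largest value in each non-overlapping block."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

signal = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]
edge_kernel = [-1.0, 0.0, 1.0]   # hand-picked for illustration, not learned
feature_map = relu(convolve1d(signal, edge_kernel))   # [1, 3, 0, 0, 0, 2, 0, 0]
pooled = max_pool(feature_map, 2)                     # [3, 0, 2, 0]
```

The pooled output is half the length of the feature map but still records where the two rising edges occurred, which is the sense in which pooling reduces the size of the features.&lt;br /&gt;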
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
LSTM networks are a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. They are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
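The keep-or-forget behaviour comes from three gates inside each LSTM unit. The sketch below steps a single LSTM unit with a scalar input and scalar state through a short sequence, using the standard gate equations. The parameter values are hypothetical and untrained, and the names are chosen for illustration only.&lt;br /&gt;

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single LSTM unit with scalar input and state.

    params maps each gate name to (input weight, recurrent weight, bias).
    """
    def gate(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(gate("forget"))        # how much of the old cell state to keep
    i = sigmoid(gate("input"))         # how much new information to write
    o = sigmoid(gate("output"))        # how much of the cell state to expose
    g = math.tanh(gate("candidate"))   # the new information itself
    c = f * c_prev + i * g             # updated cell state (the long-term memory)
    h = o * math.tanh(c)               # updated hidden state (the output)
    return h, c

# Hypothetical, untrained parameters.
params = {"forget": (0.5, 0.1, 1.0), "input": (0.8, 0.2, 0.0),
          "output": (0.3, 0.1, 0.0), "candidate": (1.0, 0.5, 0.0)}
h, c = 0.0, 0.0
for sample in [0.1, 0.9, -0.4, 0.2]:
    h, c = lstm_step(sample, h, c, params)
```

A forget gate close to 1 carries the cell state forward almost unchanged, which is how information from early in a long ECG sequence can survive to the end.&lt;br /&gt;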
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between five classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, using a bidirectional LSTM network and wavelet sequences to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy ECG waveform. Notice that the rhythm (i.e. the time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM then classifies the signals based on these features. The data was split into two sets: a training set and a test set. The training set was used to train the classifier, and the test set was used to measure its accuracy. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the classifier achieved an accuracy of approximately 98% on the test set. We use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigated the effects of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity were applied to the same dataset, then the raw dataset and its cleaned versions were fed into the classifiers to measure the importance of the pre-processing stage. &lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm which causes the heart to beat chaotically and irregularly compared with a normal rhythm. It is therefore possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. To that end, we performed feature engineering to extract information about heart rate variability (HRV), and used an SVM to recognise the pattern of AF signals.&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we experimented with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features.&lt;br /&gt;
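&lt;br /&gt;
As an illustration of the time-domain HRV features listed in the table, the following is a minimal Python sketch (not the ExtractFeatures.m tool, and not the code used in this project) that computes SDNN, RMSSD and NNx from a list of RR intervals. The function name, the 50 ms threshold and the two RR series are assumptions made up for illustration only.&lt;br /&gt;
&lt;br /&gt;
```python
import math
import statistics


def hrv_time_domain(rr_ms, nn_threshold_ms=50.0):
    """Compute SDNN, RMSSD and NNx from RR intervals given in milliseconds."""
    # Differences between successive RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # SDNN: standard deviation of the RR (NN) intervals.
    sdnn = statistics.stdev(rr_ms)
    # RMSSD: root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # NNx: number of successive differences larger than the threshold.
    nnx = sum(1 for d in diffs if abs(d) > nn_threshold_ms)
    return {"SDNN": sdnn, "RMSSD": rmssd, "NNx": nnx}


# Hypothetical RR series (ms): one steady rhythm and one irregular rhythm.
regular = [800, 810, 795, 805, 800, 790, 805]
irregular = [620, 940, 710, 1050, 560, 880, 690]

regular_features = hrv_time_domain(regular)
irregular_features = hrv_time_domain(irregular)
print(regular_features)
print(irregular_features)
```
&lt;br /&gt;
With these made-up values, all three measures are larger for the irregular series, which is the kind of separation between classes that a classifier such as an SVM relies on.&lt;br /&gt;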
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, the code first attempts to classify the data without extracting any features; this result is used as a comparison later. The classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, which are then passed to the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits at around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, the 3000-sample signals are reduced to a 2-by-63 feature array. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, this led either to errors or to extremely poor results, and so is not shown here.&lt;br /&gt;
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning takes a pre-trained model, which has already been trained on a large number of general images, and uses it as the starting point for learning from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a large number of images. However, we still need to train and fine-tune the model so that it can recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
[[File:TransferLearning.png]]&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set contains 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all of the normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score conveys the balance between the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the F1-score for identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy, but gives far more meaning to the results.&lt;br /&gt;
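&lt;br /&gt;
The arithmetic in the example above can be checked with a short Python sketch. The function name is an assumption for illustration; the counts come directly from the example, with the abnormal class treated as the positive class.&lt;br /&gt;
&lt;br /&gt;
```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return accuracy, precision, recall, f1


# 98 normal cases, all predicted normal: tn=98, fp=0.
# 2 abnormal cases, one detected (tp=1) and one missed (fn=1).
accuracy, precision, recall, f1 = classification_metrics(tp=1, fp=0, fn=1, tn=98)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} F1={f1:.3f}")
# prints: accuracy=0.990 precision=1.000 recall=0.500 F1=0.667
```
&lt;br /&gt;
The precision is perfect because no normal case is flagged as abnormal, but the recall of 0.5 pulls the F1-score down to 0.667, exposing the missed case that the accuracy hides.&lt;br /&gt;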
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN outperformed the other classification methods (best F1-score 0.848), although the LSTM was not far behind (0.817). Although the CNN produced the highest scores, the LSTM holds the advantage of being quicker and less computationally intensive to use, whilst still being more effective than the SVM classifier (0.7935). For every classifier, wavelet denoising alone was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None (computed on raw ECG data) || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:TransferLearning.png&amp;diff=16842</id>
		<title>File:TransferLearning.png</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:TransferLearning.png&amp;diff=16842"/>
		<updated>2021-10-21T11:12:30Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;tl&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16841</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16841"/>
		<updated>2021-10-21T09:53:13Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often, signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One biological signal of particular interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf &amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between subsequent R peaks, are of interest, since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing), as well as noise, are often overlaid on the ECG. For these reasons, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focusses on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF is characterised by palpitations, shortness of breath and chest pain, and its incidence increases with age.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner: wavelet decomposition is applied to the signal, and a threshold is used to concentrate the signal over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
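&lt;br /&gt;
The decompose, threshold and reconstruct idea described above can be sketched in a few lines of Python. This is a simplified illustration only: it assumes a single-level Haar wavelet, soft thresholding and a hand-picked threshold, whereas practical ECG denoising typically uses several decomposition levels and a wavelet chosen to resemble the ECG features. All function names and sample values are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import math


def haar_forward(signal):
    """One level of the Haar wavelet transform (signal length must be even)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from approximation and detail coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink coefficients towards zero; small ones become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


# Made-up samples: small alternating noise around a short pulse.
noisy = [0.1, -0.1, 0.1, -0.1, 1.0, 1.2, 0.1, -0.1]
cleaned = denoise(noisy, threshold=0.2)
print([round(v, 3) for v in cleaned])
```
&lt;br /&gt;
Here the small sample-to-sample fluctuations fall below the threshold and are removed, while the pulse survives in the approximation coefficients. With a threshold of zero the signal is reconstructed unchanged.&lt;br /&gt;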
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
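&lt;br /&gt;
The later stages of the Pan-Tompkins process described above can be sketched as follows. This is a minimal Python illustration, not the implementation used to produce Figures 2.4 and 2.5: it omits the band-pass filtering stage, uses a simple first difference as the derivative, and the function name, window length, 50% threshold and synthetic trace are all assumptions.&lt;br /&gt;
&lt;br /&gt;
```python
def pan_tompkins_envelope(ecg, window):
    """Derivative, squaring and moving-window integration (no band-pass stage)."""
    # Differentiate: emphasises the steep slopes of the QRS complex.
    derivative = [b - a for a, b in zip(ecg, ecg[1:])]
    # Square: makes every value positive and amplifies large slopes.
    squared = [d * d for d in derivative]
    # Moving-window integration: average of the most recent `window` samples.
    integrated = []
    running = 0.0
    for i, value in enumerate(squared):
        running += value
        if i >= window:
            running -= squared[i - window]
        integrated.append(running / window)
    return integrated


# Synthetic trace: flat baseline with two sharp spikes at samples 20 and 60.
ecg = [0.0] * 80
ecg[20] = 1.0
ecg[60] = 1.0

envelope = pan_tompkins_envelope(ecg, window=5)
peak_value = max(envelope)
# Samples where the envelope exceeds half of its maximum are QRS candidates.
candidates = [i for i, v in enumerate(envelope) if v > 0.5 * peak_value]
print(candidates)
```
&lt;br /&gt;
The candidate samples cluster immediately after each spike, reflecting the delay introduced by the integration window. A full implementation would also apply adaptive thresholds and a refractory period before declaring an R peak.&lt;br /&gt;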
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described with frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class separated further into specific conditions. Classification can be completed using many different methods. In this project, the classification step has made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters of each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set of data. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or a hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on where it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
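&lt;br /&gt;
The convolution and pooling stages described above can be illustrated on a one-dimensional signal. The following is a minimal sketch in plain Python, not the network used in this project; the function names, the kernel and the sample values are hypothetical. In a real CNN the kernel values are learned during training.&lt;br /&gt;

```python
def convolve_1d(signal, kernel):
    """Valid-mode sliding dot product (cross-correlation, as used in CNN layers)."""
    width = len(kernel)
    return [
        sum(signal[start + k] * kernel[k] for k in range(width))
        for start in range(len(signal) - width + 1)
    ]


def max_pool_1d(features, size):
    """Keep the largest value in each non-overlapping window of the given size."""
    return [max(features[i:i + size]) for i in range(0, len(features) - size + 1, size)]


signal = [0, 1, 3, 1, 0, 0, 2, 0]
edge_kernel = [-1, 1]  # hypothetical kernel that responds to rising edges
features = convolve_1d(signal, edge_kernel)
pooled = max_pool_1d(features, 2)
print(features)  # [1, 2, -2, -1, 0, 2, -2]
print(pooled)    # [2, -1, 2]
```

The pooled output is less than half the length of the feature map, which is where the saving in computation comes from.&lt;br /&gt;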
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as a sequence is processed, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between five classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, using a bidirectional LSTM network and wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the classifier, and the test set was used to measure its accuracy. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we investigate the effects of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity are applied to the same dataset, then the raw dataset and its cleaned versions are fed into the classifiers to measure the importance of the pre-processing stage.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm, causing the heart to beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we perform feature engineering that extracts information about heart rate variability (HRV), and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=LN_F&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we experiment with these features using the SVM algorithm.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number &lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
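&lt;br /&gt;
To make the time-domain entries in the table more concrete, the sketch below computes SDNN, RMSSD and an NNx count from a list of RR intervals. It is an illustrative sketch only and is not the ExtractFeatures.m tool; the function names, the 50 ms threshold and the RR values are assumptions chosen for the example.&lt;br /&gt;

```python
import math
import statistics


def sdnn(rr_ms):
    """Sample standard deviation of the RR (NN) intervals, in milliseconds."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of the differences between successive intervals."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def nn_count(rr_ms, threshold_ms=50):
    """NNx: number of successive differences larger than the threshold."""
    return sum(1 for a, b in zip(rr_ms, rr_ms[1:]) if abs(b - a) > threshold_ms)


# Made-up RR intervals (ms), for illustration only.
regular = [800, 810, 790, 805, 795]
irregular = [620, 940, 710, 1050, 580]

for name, rr in (("regular", regular), ("irregular", irregular)):
    print(name, round(sdnn(rr), 1), round(rmssd(rr), 1), nn_count(rr))
```

All three measures are much larger for the irregular sequence, which is the behaviour that makes them useful for separating AF from a regular rhythm.&lt;br /&gt;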
&lt;br /&gt;
We developed our own algorithm for selecting and extracting HRV features, and used a tool named ExtractFeatures.m provided by Andreotti&amp;lt;ref name=LN_FF&amp;gt;F. Andreotti, Access, 2017; [Online]. Available: https://github.com/fernandoandreotti/cinc-challenge2017/tree/master/featurebased-approach&amp;lt;/ref&amp;gt; to extract the 169 features.&lt;br /&gt;
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, this code first attempts to classify the data without extracting any features, which is used as a comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and it prepares the output for the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits around 40%, and that training takes a reasonable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features the 3000-sample signals are reduced to a 2-by-63 vector. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, this led either to errors or to extremely poor results, and so is not shown here.&lt;br /&gt;
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
According to Gajendran et al.&amp;lt;ref name=LN_M&amp;gt;M. K. Gajendran et al., ECG Classification using Deep Transfer Learning, in IEEE Access, 2021; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9476957&amp;lt;/ref&amp;gt;, transfer learning techniques can be applied to detect abnormalities in the cardiovascular system. Transfer learning uses pre-trained models, which have already been trained on a large number of general images, as the starting point for learning from our own dataset. An advantage of this method is that we do not need to build and train our own model from scratch, which is time-consuming and requires a lot of images. However, we still need to train and fine-tune the model so that it can recognise patterns in our recordings.&lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes them was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set may contain 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the F1-score for identifying the abnormal class would be 66.7%, which is significantly lower than the accuracy, but gives far more meaning to the results.&lt;br /&gt;
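&lt;br /&gt;
The arithmetic behind the 98 normal and 2 abnormal example can be checked with a short calculation. This is a worked sketch of that example only, with the abnormal class treated as the positive class; the variable names are chosen for illustration.&lt;br /&gt;

```python
# Counts for the example above, with "abnormal" as the positive class:
# all 98 normal cases and 1 abnormal case are labelled normal,
# and the remaining abnormal case is correctly detected.
true_positive = 1
false_positive = 0
false_negative = 1
true_negative = 98

total = true_positive + false_positive + false_negative + true_negative
accuracy = (true_positive + true_negative) / total
precision = true_positive / (true_positive + false_positive)
recall = true_positive / (true_positive + false_negative)
f1_score = 2 * precision * recall / (precision + recall)

print(f"accuracy = {accuracy:.3f}")   # accuracy = 0.990
print(f"F1-score = {f1_score:.3f}")   # F1-score = 0.667
```

The precision is perfect because nothing is wrongly flagged as abnormal, but the recall of 0.5 pulls the F1-score down to two thirds.&lt;br /&gt;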
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
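&lt;br /&gt;
As a rough illustration of how a confusion chart relates to the F1-score, the sketch below builds a three-class confusion matrix and reads the F1-score of one class from it. The labels are made up for the example and are not results from this project; the function names are likewise hypothetical.&lt;br /&gt;

```python
from collections import Counter

CLASSES = ["Normal", "AF", "Other"]


def confusion_matrix(true_labels, predicted_labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in CLASSES] for t in CLASSES]


def f1_for_class(matrix, index):
    """F1-score of one class, read from its row and column of the matrix."""
    true_positive = matrix[index][index]
    if true_positive == 0:
        return 0.0
    predicted = sum(row[index] for row in matrix)  # column total
    actual = sum(matrix[index])                    # row total
    precision = true_positive / predicted
    recall = true_positive / actual
    return 2 * precision * recall / (precision + recall)


# Made-up labels, for illustration only.
truth = ["Normal", "Normal", "Normal", "AF", "AF", "Other"]
guess = ["Normal", "Normal", "AF", "AF", "Other", "Other"]
matrix = confusion_matrix(truth, guess)
for name, row in zip(CLASSES, matrix):
    print(name, row)
print("AF F1-score:", f1_for_class(matrix, CLASSES.index("AF")))
```

Off-diagonal entries show exactly which classes are being confused with each other, which a single score cannot convey.&lt;br /&gt;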
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest scores, the LSTM holds the advantage of being quicker and less computationally intensive to use, whilst still outperforming the SVM classifier on the wavelet-denoised data (0.817 compared to 0.7935). In all cases the wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16816</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16816"/>
		<updated>2021-10-21T03:24:34Z</updated>

		<summary type="html">&lt;p&gt;A1798520: /* Results */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment, or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One biological signal which is of interest is the electrocardiogram (ECG). These signals are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so the early detection and treatment of CVD is critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf &amp;lt;/ref&amp;gt;. ECGs can therefore be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular the P wave, T wave and QRS complex, as well as the time between subsequent R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles; the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG. For these reasons, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors are able to be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focuses on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF is characterised by palpitations, shortness of breath and chest pain, and its incidence increases with age.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be completed with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50Hz notch filter to remove powerline interference, a 30Hz low-pass filter to remove high frequency noise, and a 0.1Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a bandpass filter with cut-off frequencies at 0.5Hz and 30Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. Instead of filtering by frequency, wavelet decomposition is applied to the signal, and coefficients below a certain threshold are suppressed so that the signal is concentrated over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
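&lt;br /&gt;
The thresholding idea can be sketched with the simplest possible wavelet. The example below is a minimal illustration, assuming a single-level Haar transform with soft thresholding and an even-length input; the function name, the threshold and the sample values are hypothetical, and practical ECG denoising typically uses several decomposition levels and other wavelet families.&lt;br /&gt;

```python
import math

ROOT2 = math.sqrt(2.0)


def haar_denoise(signal, threshold):
    """One-level Haar transform, soft-threshold the details, then invert."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / ROOT2 for a, b in pairs]   # local averages
    detail = [(a - b) / ROOT2 for a, b in pairs]   # local differences
    # Soft thresholding: shrink each detail coefficient towards zero.
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    output = []
    for a, d in zip(approx, shrunk):
        output.extend([(a + d) / ROOT2, (a - d) / ROOT2])
    return output


# Made-up samples: a small fluctuation followed by a large, sharp feature.
noisy = [1.0, 1.2, 5.0, 1.0]
print(haar_denoise(noisy, threshold=0.5))
```

The small fluctuation in the first pair of samples is flattened to its average, while the large step in the second pair survives in reduced form.&lt;br /&gt;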
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
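&lt;br /&gt;
The later stages of the process described above can be sketched as follows. This is a simplified illustration rather than the full Pan-Tompkins algorithm: the band-pass filtering stage is omitted, a plain first difference stands in for the derivative filter, and a fixed threshold replaces the adaptive thresholds. The function name, window length and test signal are assumptions made for the example.&lt;br /&gt;

```python
def pan_tompkins_envelope(ecg, window=5):
    """Differentiate, square, then integrate over a moving window."""
    # Differentiation emphasises the steep slopes of the QRS complex.
    derivative = [b - a for a, b in zip(ecg, ecg[1:])]
    # Squaring makes every sample non-negative and amplifies large slopes.
    squared = [d * d for d in derivative]
    # Moving-window integration merges each QRS complex into a single pulse.
    envelope = []
    for end in range(1, len(squared) + 1):
        start = max(0, end - window)
        envelope.append(sum(squared[start:end]) / window)
    return envelope


# Made-up signal with two sharp "R peaks" on a flat baseline.
ecg = [0.0] * 40
ecg[10] = 1.0
ecg[30] = 1.0
envelope = pan_tompkins_envelope(ecg)
threshold = 0.75 * max(envelope)
above = [i for i, value in enumerate(envelope) if value >= threshold]
print(above)  # [10, 11, 12, 13, 30, 31, 32, 33]
```

Each run of indices above the threshold marks one candidate QRS complex, slightly delayed relative to the peak by the integration window.&lt;br /&gt;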
&lt;br /&gt;
Alternatively, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described by frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum Comparison of Normal and AF ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class broken down further into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this knowledge to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, training may form clusters for each class or assign weights within a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
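&lt;br /&gt;
The split, train and validate procedure described above can be illustrated with a short Python sketch. It is a toy example only: a nearest-centroid rule stands in for the classifiers used in this project, and the function names, the 20% test fraction and the single RR-variability feature values are all hypothetical.&lt;br /&gt;

```python
import random


def split_dataset(samples, labels, test_fraction=0.2, seed=0):
    """Shuffle the data, then hold out a fraction of it as the test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


def train_centroids(train):
    """Training step: store the mean feature value of each class."""
    grouped = {}
    for value, label in train:
        grouped.setdefault(label, []).append(value)
    return {label: sum(v) / len(v) for label, v in grouped.items()}


def predict(centroids, value):
    """Assign the class whose centroid is closest to the feature value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Hypothetical one-feature data set: RR-interval variability in seconds.
samples = [0.02, 0.03, 0.01, 0.04, 0.02, 0.15, 0.20, 0.18, 0.25, 0.17]
labels = ["normal"] * 5 + ["AF"] * 5
train, test = split_dataset(samples, labels)
centroids = train_centroids(train)
# Validation: compare the assigned classes with the true classes.
correct = sum(predict(centroids, x) == y for x, y in test)
accuracy = correct / len(test)
```

The key design point is that the centroids are computed from the training set only, so the test samples are genuinely unseen when the accuracy is measured.&lt;br /&gt;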
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the value of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-order space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the region in which it falls. Figure 2.10 shows a simple 2-dimensional example with Class 1 in red and Class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as Class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational cost of classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An LSTM network is a type of recurrent neural network (RNN) which is well-suited to classifying time-series data. LSTM networks are an improvement over traditional RNNs, which suffer from short-term memory and hence have a tendency to &amp;quot;forget&amp;quot; what was seen earlier in longer sequences&amp;lt;ref name=SK_LS&amp;gt;M. Phi; 2018; Illustrated Guide to LSTM’s and GRU’s: A step by step explanation; [Online], Available: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21&amp;lt;/ref&amp;gt;. LSTM networks have the ability to keep or forget information as training progresses, enabling them to effectively analyse long sequences of data by retaining only the important information. The structure of an LSTM unit is shown in Figure 2.13.&lt;br /&gt;
&lt;br /&gt;
LSTM networks have been used to successfully classify ECG arrhythmias&amp;lt;ref name=SK_LL&amp;gt;B. Hou, J. Yang, P. Wang, R. Yan, LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification, in IEEE Transactions on Instrumentation and Measurement, vol. 69, issue 4, 2020, [Online], DOI: 10.1109/TIM.2019.2910342&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LT&amp;gt;S. Saadatnejad, M. Oveisi, M. Hashemi, LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices, in IEEE Journal of Biomedical and Health Informatics, vol. 24, issue 2, 2020, [Online], DOI: 10.1109/JBHI.2019.2911367&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_LM&amp;gt;O. Yildirim, A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification, in Computers in Biology and Medicine, vol. 96, pp 189-202, 2018, [Online], Available: https://doi.org/10.1016/j.compbiomed.2018.03.016&amp;lt;/ref&amp;gt;. Hou et al.&amp;lt;ref name=SK_LL/&amp;gt; used an LSTM network with an SVM to distinguish between 5 classes of ECGs with sensitivities and specificities above 95%. Saadatnejad et al.&amp;lt;ref name=SK_LT/&amp;gt; proposed an LSTM classifier for wearable cardiac monitoring. Their algorithm was found to be both accurate and less computationally intensive than other deep learning approaches. Yildirim&amp;lt;ref name=SK_LM/&amp;gt; took a novel approach, combining a bidirectional LSTM network with a wavelet sequence to classify ECG signals, and reported a high recognition performance of 99.25%.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:LSTM Structure.gif|&amp;#039;&amp;#039;Figure 2.13: LSTM Unit Structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_LL/&amp;gt;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
There is an example on MathWorks which demonstrates how to classify ECG signals using wavelet-based feature extraction and an SVM classifier using MATLAB&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine on how to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into classifiers to measure the importance of the pre-processing step.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm that causes the heart to beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we will perform feature engineering that extracts information about heart rate variability, and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
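&lt;br /&gt;
As an illustration of the kind of heart rate variability features that could be passed to an SVM, the Python sketch below computes the mean RR interval, SDNN and RMSSD from a list of R-peak times. The choice of these three features, the function name and the example peak times are assumptions made for illustration; the text above does not specify which features were engineered.&lt;br /&gt;

```python
import math


def hrv_features(r_peak_times):
    """Simple heart-rate-variability features from R-peak times (seconds)."""
    # RR intervals: time between successive R-peaks.
    rr = [b - a for a, b in zip(r_peak_times, r_peak_times[1:])]
    mean_rr = sum(rr) / len(rr)
    # SDNN: standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr) / len(rr))
    # RMSSD: root mean square of successive RR-interval differences.
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_rr": mean_rr, "sdnn": sdnn, "rmssd": rmssd}


# Hypothetical R-peak times: a regular rhythm and an irregular rhythm.
regular = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0]
irregular = [0.0, 0.5, 1.6, 2.0, 3.3, 3.7]
regular_features = hrv_features(regular)
irregular_features = hrv_features(irregular)
```

For the regular rhythm both variability measures are essentially zero, whereas the irregular rhythm produces much larger values, which is the separation between classes that a classifier can exploit.&lt;br /&gt;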
&lt;br /&gt;
==== Long Short-Term Memory ====&lt;br /&gt;
An example from MathWorks using an LSTM model was identified&amp;lt;ref name=MW_LSTM&amp;gt;The MathWorks, Inc.; 2017; &amp;#039;&amp;#039;Classify ECG Signals Using Long Short-Term Memory Networks&amp;#039;&amp;#039;; Available: https://au.mathworks.com/help/signal/ug/classify-ecg-signals-using-long-short-term-memory-networks.html&amp;lt;/ref&amp;gt;. Although this also used the PhysioNet database&amp;lt;ref name=PhysioNet/&amp;gt;, we modified it to use the data we had collected and pre-processed.&lt;br /&gt;
&lt;br /&gt;
When run, this code first attempts to classify the data without extracting any features, which is used as a comparison later. This classifier uses a bidirectional LSTM layer, meaning it looks at the data in both the forward and backward directions. The bidirectional LSTM layer is specified with 100 hidden units, meaning each signal is mapped to 100 features, and prepares the output for the fully-connected layer (neural network). Three classes are output: normal, AF, and other abnormality. The training progress is shown in Figure X. Notice that the accuracy sits around 40%, and that training takes a considerable amount of time (about 20 minutes in this case).&lt;br /&gt;
&lt;br /&gt;
Next, feature extraction is used to improve these results. By default, the program extracts the instantaneous frequency and spectral entropy of the signals. The instantaneous frequency estimates the time-dependent frequency of a signal, and the spectral entropy measures how spiky or flat the spectrum of the signal is. By extracting these features, each 3000-sample signal is reduced to a 2-by-63 feature matrix. The LSTM used is the same as in the first case, although it now runs significantly faster and achieves a more accurate result, as shown in Figure X. Attempts were made to alter the features extracted; however, these led either to errors or to extremely poor results, and so are not shown here.&lt;br /&gt;
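&lt;br /&gt;
To make the idea of spectral entropy concrete, the Python sketch below computes a normalised spectral entropy for a whole signal using a direct discrete Fourier transform. It is an assumption-laden illustration and not the MATLAB routine used in the MathWorks example, which evaluates the features over short time windows; the function name and the two test signals are hypothetical.&lt;br /&gt;

```python
import cmath
import math


def spectral_entropy(signal):
    """Normalised spectral entropy: near 0 for a spiky spectrum, 1 for a flat one."""
    n = len(signal)
    # Power spectrum from a direct DFT (positive frequencies, DC excluded).
    power = []
    for k in range(1, n // 2):
        coeff = sum(
            signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        power.append(abs(coeff) ** 2)
    total = sum(power)
    # Treat the normalised spectrum as a probability distribution.
    probs = [p / total for p in power]
    entropy = -sum(p * math.log2(p) for p in probs if p != 0.0)
    return entropy / math.log2(len(probs))


n = 128
# A pure sinusoid concentrates its power in a single spectral spike.
sine = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
# A unit impulse spreads its power evenly across all frequencies.
impulse = [1.0] + [0.0] * (n - 1)
sine_entropy = spectral_entropy(sine)
impulse_entropy = spectral_entropy(impulse)
```

The sinusoid yields an entropy close to zero and the impulse an entropy close to one, matching the spiky-versus-flat interpretation given above.&lt;br /&gt;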
&lt;br /&gt;
This feature extraction process was completed for the raw ECG signals, the wavelet denoised ECG signals, and the MoV of the ECGs. The results are shown in the results section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:LSTM on raw ECG data.png|&amp;#039;&amp;#039;Figure X: LSTM Training using Raw ECG Data.&amp;#039;&amp;#039;&lt;br /&gt;
File:LSTM with feature extraction.png|&amp;#039;&amp;#039;Figure X: LSTM Training with Feature Extraction.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of pre-processing and classification techniques mentioned above. The results are summarised in Table X and Figure X below. In order to compare the results, a single measure which suitably describes the results was needed. Accuracy may seem like an obvious choice, but it can be misleading. For example, in a real-world system where a sample set may contain 98 normal cases and 2 abnormal cases, 99% accuracy could be achieved by classifying all normal cases and one of the abnormal cases as normal. However, this would mean that one of the abnormal cases is missed, which could be catastrophic in the case of a life-threatening illness. For this reason, the F1-score was used instead. The F1-score is the harmonic mean of the precision (true positives divided by the sum of true positives and false positives) and the recall (true positives divided by the sum of true positives and false negatives) of the model. In this example, the one abnormal case that is detected gives a precision of 100% and a recall of 50%, so the F1-score for the abnormal class is 66.7%. This is significantly lower than the accuracy, but gives far more meaning to the results.&lt;br /&gt;
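&lt;br /&gt;
The arithmetic behind this example can be checked with a short Python sketch. The function name is hypothetical; the counts are taken directly from the 98 normal and 2 abnormal cases described above, with the abnormal class treated as the positive class.&lt;br /&gt;

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1-score from raw counts for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Abnormal class: one case detected, one missed, and no normal case
# wrongly flagged as abnormal. The 98 normal cases are true negatives.
tp, fp, fn, tn = 1, 0, 1, 98
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={accuracy:.3f}, precision={precision:.3f}, "
      f"recall={recall:.3f}, F1={f1:.3f}")
# prints: accuracy=0.990, precision=1.000, recall=0.500, F1=0.667
```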
&lt;br /&gt;
In each case, the results were displayed as a confusion chart, such as the one in Figure X. The confusion chart shows the predicted classes in comparison to the true classes of the data. It is a useful tool for understanding how the classifier is behaving, and where issues may be occurring. The better each class is predicted (the stronger the diagonal in the confusion matrix), the better the performance of the classifier.&lt;br /&gt;
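&lt;br /&gt;
A confusion chart and the per-class F1-score derived from it can be sketched as follows. The labels below are made up for illustration and are not results from this project, and the function names are hypothetical.&lt;br /&gt;

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def f1_for_class(matrix, k):
    """F1-score of class k, read directly from the confusion matrix."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # true class k, predicted otherwise
    fp = sum(row[k] for row in matrix) - tp  # predicted k, true class otherwise
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical labels for the three classes considered in this project.
classes = ["Normal", "AF", "Other"]
truth = ["Normal", "Normal", "AF", "AF", "AF", "Other", "Other", "Normal"]
guess = ["Normal", "Other", "AF", "AF", "Normal", "Other", "Other", "Normal"]
matrix = confusion_matrix(truth, guess, classes)
af_f1 = f1_for_class(matrix, classes.index("AF"))
```

Entries on the diagonal count correct predictions, so any weight off the diagonal shows exactly which pair of classes the classifier is confusing.&lt;br /&gt;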
&lt;br /&gt;
Our findings are summarised in Table X and Figure X below, using the F1-score of the AF class. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. Although the CNN produced the highest results, the LSTM holds an advantage of being quicker and less computationally intensive to use, whilst still being notably more effective than the SVM classifier. In all cases the wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.785&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.7935&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.6752&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || 0.507&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16774</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16774"/>
		<updated>2021-10-21T00:14:57Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment, or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal which is of interest is the electrocardiogram (ECG). These signals are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf &amp;lt;/ref&amp;gt;. In this way, ECGs can be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between these. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular the P wave, T wave and QRS complex, as well as time between subsequent R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref name=SK_B&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing), as well as noise, are often overlaid on the ECG. For these reasons, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that covers heart disease, stroke and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focuses on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. AF is characterised by palpitations, shortness of breath and chest pain, and its incidence increases with age.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results of a number of previous studies. The processes found in the literature are also briefly discussed in this section.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be done with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings: a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies of 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. The signal is first decomposed into wavelet coefficients, and a threshold is then applied to these coefficients. Because the wavelet transform concentrates the signal in only a few large coefficients, suppressing the small coefficients removes mostly noise&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering because particular types of wavelets are similar in shape to the ECG features. Another advantage of wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
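&lt;br /&gt;
The decompose, threshold and reconstruct idea can be illustrated with a minimal sketch. It assumes a single-level Haar wavelet and soft thresholding, chosen only for brevity; the function names and the threshold value are hypothetical and are not the settings used in this project.&lt;br /&gt;

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """One level of the Haar transform (assumes an even number of samples)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the signal from its approximation and detail coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def soft_threshold(coeffs, threshold):
    """Shrink every coefficient towards zero by the threshold amount."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def denoise(signal, threshold):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the signal is reconstructed unchanged. A larger threshold removes more of the small, rapidly varying detail, which is where noise is assumed to sit in this sketch.&lt;br /&gt;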
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B/&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
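&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched as follows. This is a simplified illustration only: it assumes the input has already been band-pass filtered, uses a plain first difference in place of the derivative filter of the published algorithm, and leaves out the final peak-detection thresholds. The function names and the window length are hypothetical.&lt;br /&gt;

```python
def differentiate(signal):
    """First difference, which emphasises the steep slopes of the QRS complex."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]

def square(signal):
    """Point-wise squaring makes every value non-negative and amplifies large slopes."""
    return [v * v for v in signal]

def moving_window_integrate(signal, window):
    """Average over a sliding window of the given number of samples."""
    return [
        sum(signal[i:i + window]) / window
        for i in range(len(signal) - window + 1)
    ]

def pan_tompkins_envelope(filtered_ecg, window):
    """Chain the three stages; peaks in the output mark likely QRS locations."""
    return moving_window_integrate(square(differentiate(filtered_ecg)), window)
```

In practice the window length would be chosen to roughly match the width of a QRS complex at the sampling rate of the recording.&lt;br /&gt;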
&lt;br /&gt;
Features can also come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz&amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1 Hz, but it is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary, meaning their properties cannot be fully described by frequency-domain information alone. This is where time-frequency features come in.&lt;br /&gt;
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ECG PSD.jpg|&amp;#039;&amp;#039;Figure 2.3: Comparison of the frequency spectra of normal and AF ECGs.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm.&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins.&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref name=SK_B/&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class separated further into specific conditions. Classification can be completed using many different methods. In this project, the classification step made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights in a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;&amp;lt;ref name=SK_E&amp;gt;R. Gholami, N. Fakhari, Support Vector Machine: Principles, Parameters, and Applications, in Handbook of Neural Computation, 2017, pp 515-535; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780128113189000272&amp;lt;/ref&amp;gt;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or a hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on where it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. They are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which reduces the computation required for classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;. Huang et al.&amp;lt;ref name=SK_R/&amp;gt; reported a 99% accuracy when using a 2D-CNN, but only a 90% accuracy for the 1D-CNN, demonstrating the power of classification based on spectral data. Similarly, Rashed-Al-Mahfuz et al.&amp;lt;ref name=SK_S/&amp;gt; classified scalogram images using a VGG16 architecture, a type of CNN with 16 layers. This method had close to 100% accuracy when distinguishing between either four or six classes of heart condition. Finally, Lih et al.&amp;lt;ref name=SK_W/&amp;gt; made use of an LSTM model along with the CNN to improve their results. Even with noisy signals, this was able to achieve high accuracy (97.33%), although it was time-consuming and required a sizeable amount of data. Furthermore, it was recommended that a pre-trained model with high performance at a related task could be used to reduce computational complexity&amp;lt;ref name=SK_S/&amp;gt;. Parts of the classifier can then be modified as needed to improve its performance.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
see figure 2.13 when we add it&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
|&amp;#039;&amp;#039;Figure 2.13: Example LSTM structure.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;Mathworks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine on how to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into the classifiers to measure the importance of the pre-processing stage.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
AF is an abnormality of the heart rhythm which causes the heart to beat chaotically and irregularly compared to a normal rhythm. Therefore, it is possible to distinguish AF from other rhythms by analysing the beat-to-beat intervals of a recording. With that aim, we will perform feature engineering that extracts information about heart rate variability, and use an SVM to recognise the pattern of AF signals.&lt;br /&gt;
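&lt;br /&gt;
As an illustration of this kind of feature engineering, the sketch below derives a few common heart rate variability statistics from a list of R-peak times. The particular features (mean RR interval, SDNN and RMSSD) and the function names are assumptions made for this example, not necessarily the feature set used in this project, and the R-peak times are assumed to come from an earlier detection step.&lt;br /&gt;

```python
import statistics

def rr_intervals(r_peak_times):
    """Time between consecutive R peaks, in the same unit as the input."""
    return [b - a for a, b in zip(r_peak_times, r_peak_times[1:])]

def hrv_features(r_peak_times):
    """Summarise beat-to-beat variability as a small feature dictionary."""
    rr = rr_intervals(r_peak_times)
    successive_diffs = [b - a for a, b in zip(rr, rr[1:])]
    squared_diffs = [d * d for d in successive_diffs]
    return {
        "mean_rr": statistics.mean(rr),
        # Population standard deviation of the RR intervals.
        "sdnn": statistics.pstdev(rr),
        # Root mean square of successive RR differences.
        "rmssd": statistics.mean(squared_diffs) ** 0.5,
    }
```

A perfectly regular rhythm gives zero for both variability measures, whereas the irregular RR intervals typical of AF give larger values, which is the kind of separation a classifier such as an SVM can exploit.&lt;br /&gt;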
&lt;br /&gt;
==== Recurrent Neural Network with Long Short-Term Memory ====&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
=== Comparing Results ===&lt;br /&gt;
In order to understand how different methods compare to one another, a parameter which could give a good representation of how well each classifier performed was needed. We decided to use the F1-score for this purpose...&lt;br /&gt;
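&lt;br /&gt;
For reference, the F1-score of a single class is the harmonic mean of precision and recall for that class. The sketch below computes it from lists of true and predicted labels. The function name and the label values are hypothetical, and how the per-class scores were combined into the single figures reported in this project is not shown here.&lt;br /&gt;

```python
def f1_score(true_labels, predicted_labels, positive):
    """F1-score for one class, treating every other label as negative."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions, so the score is defined as zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Unlike plain accuracy, this score is not inflated when one class (for example normal rhythm) makes up most of the dataset, which makes it a reasonable choice for comparing classifiers on imbalanced ECG data.&lt;br /&gt;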
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of pre-processing and classification techniques mentioned above. From our findings, the F1-score of each is summarised in Table 4.1 and Figure 4.2 below. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. In all cases the wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
Although the CNN produced the highest results, the LSTM holds an advantage of being quicker and less computationally intensive to use, whilst still being notably more effective than the SVM classifier.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot; style=&amp;quot;margin-left: auto; margin-right: auto; border: none;&amp;quot;&lt;br /&gt;
|+ &amp;#039;&amp;#039;&amp;#039;Table 4.1: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.469&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.485&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.483&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || &lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure 4.2: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:Methodology.drawio.png&amp;diff=16773</id>
		<title>File:Methodology.drawio.png</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=File:Methodology.drawio.png&amp;diff=16773"/>
		<updated>2021-10-20T23:40:42Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Figure X. ECG classification methodology&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16765</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16765"/>
		<updated>2021-10-20T16:25:49Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment, or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One biological signal which is of interest is the electrocardiogram (ECG). These signals are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so the early detection and treatment of these diseases is critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review,  in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf  &amp;lt;/ref&amp;gt;. ECGs can therefore be acquired by placing electrodes on the body (either on the torso or the limbs), and measuring the potential difference between them. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular the P-wave, T-wave and QRS complex, as well as the time between subsequent R peaks, are of interest since any irregularity or absence in any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles, although the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between subsequent heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may present dissimilar signs in different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing) and noise are often overlaid on the ECG. For these reasons, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focussed on just one, atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be done with ordinary high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies at 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
&lt;br /&gt;
Wavelet denoising works in quite a different manner. Wavelet decomposition is applied to the signal, and a threshold is used to concentrate the signal over only a few wavelet coefficients&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering, as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
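&lt;br /&gt;
The thresholding idea can be illustrated with a minimal sketch. It assumes a single-level Haar wavelet and soft thresholding of the detail coefficients, chosen only because they fit in a few lines; the function names are hypothetical and this is not the wavelet, decomposition level or threshold rule used in the project.&lt;br /&gt;

```python
import math


def haar_forward(signal):
    """One level of the Haar transform (signal length must be even)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink every coefficient towards zero by the threshold."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def haar_denoise(signal, threshold):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


# Hypothetical four-sample signal with a small sample-to-sample ripple.
noisy = [1.0, 1.2, 1.0, 0.8]
denoised = haar_denoise(noisy, 0.2)
```

With the threshold set to zero the signal is reconstructed unchanged; with a threshold larger than the ripple, each pair of samples is replaced by its average.&lt;br /&gt;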
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency; however, it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B&amp;gt;S.H. Jambukia, V.K. Dabhi, H.B. Prajapati, Classification of ECG Signals using Machine Learning Techniques: a Survey, IEEE, 2015; [Online], Available: https://ieeexplore.ieee.org/abstract/document/7164783&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
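&lt;br /&gt;
The differentiate, square and integrate stages described above can be sketched as follows. This is a simplified illustration rather than the full Pan-Tompkins algorithm: it assumes the ECG has already been band-pass filtered, uses a plain first difference as the derivative, and replaces the adaptive thresholds of the original method with a fixed fraction of the maximum. The function names, window length and test signal are hypothetical.&lt;br /&gt;

```python
def qrs_envelope(ecg, fs, window_s=0.15):
    """Differentiate, square and integrate an already band-passed ECG."""
    # Differentiation emphasises the steep slopes of the QRS complex.
    diff = [ecg[i + 1] - ecg[i] for i in range(len(ecg) - 1)]
    # Squaring makes every sample positive and amplifies large slopes.
    squared = [d * d for d in diff]
    # Moving-window integration over roughly one QRS width.
    width = max(1, int(window_s * fs))
    envelope, running = [], 0.0
    for i, value in enumerate(squared):
        running += value
        if i >= width:
            running -= squared[i - width]
        envelope.append(running / width)
    return envelope


def find_beats(envelope, fs, threshold_ratio=0.5, refractory_s=0.2):
    """Return indices of envelope peaks above a fixed fraction of the maximum."""
    threshold = threshold_ratio * max(envelope)
    refractory = int(refractory_s * fs)
    beats = []
    for i in range(1, len(envelope) - 1):
        is_peak = envelope[i] >= envelope[i - 1] and envelope[i] > envelope[i + 1]
        far_enough = not beats or i - beats[-1] > refractory
        if is_peak and envelope[i] >= threshold and far_enough:
            beats.append(i)
    return beats


# Synthetic signal (hypothetical): four triangular "R peaks", one per second.
fs = 250
ecg = [0.0] * (4 * fs)
for centre in (125, 375, 625, 875):
    for offset, amplitude in ((-2, 0.2), (-1, 0.6), (0, 1.0), (1, 0.6), (2, 0.2)):
        ecg[centre + offset] = amplitude

beats = find_beats(qrs_envelope(ecg, fs), fs)
rr_seconds = [(b - a) / fs for a, b in zip(beats, beats[1:])]
```

The integrator delays each detection relative to the true R peak, but the delay is the same for every beat, so the RR intervals computed from the detections are unaffected.&lt;br /&gt;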
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
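&lt;br /&gt;
As an illustration of the maximum-amplitude frequency feature, the sketch below finds the largest non-DC component of a short signal using a direct discrete Fourier transform. It is a toy example under stated assumptions: the sampling rate, the synthetic two-tone signal and the function name are hypothetical, and a practical implementation would use an FFT on real ECG data.&lt;br /&gt;

```python
import cmath
import math


def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest non-DC DFT component."""
    n = len(signal)

    def magnitude(k):
        # Direct evaluation of DFT bin k; slow but fine for short signals.
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        return abs(total)

    best_bin = max(range(1, n // 2 + 1), key=magnitude)
    return best_bin * fs / n


# Hypothetical signal: a strong 1 Hz component plus a weaker 5 Hz component.
fs = 50
t = [i / fs for i in range(4 * fs)]
signal = [math.sin(2 * math.pi * 1.0 * x) + 0.3 * math.sin(2 * math.pi * 5.0 * x) for x in t]
peak_hz = dominant_frequency(signal, fs)
```

The frequency resolution is fs divided by the number of samples (0.25 Hz here), so longer recordings locate the dominant component more precisely.&lt;br /&gt;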
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;, with classes such as normal and abnormal, and possibly with the abnormal class separated further into specific conditions. Classification can be completed using many different methods. In this project, the classification step has made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may form clusters for each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
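&lt;br /&gt;
The split, train and validate procedure can be sketched with a deliberately simple stand-in classifier. The example below assumes a nearest-centroid rule on a single made-up feature per recording; it is not one of the classifiers used in this project, and the data, the 70/30 split and the function names are hypothetical.&lt;br /&gt;

```python
import random


def split(samples, train_fraction=0.7, seed=0):
    """Shuffle reproducibly, then cut into a training set and a test set."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def train_centroids(train_set):
    """Learn one mean feature value per class label."""
    sums, counts = {}, {}
    for feature, label in train_set:
        sums[label] = sums.get(label, 0.0) + feature
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(centroids, feature):
    """Assign the label whose centroid is closest to the feature."""
    return min(centroids, key=lambda label: abs(feature - centroids[label]))


def accuracy(centroids, test_set):
    """Fraction of test samples whose assigned class matches the true class."""
    correct = sum(1 for feature, label in test_set if classify(centroids, feature) == label)
    return correct / len(test_set)


# Hypothetical feature: RR-interval variability, low for normal (N), high for AF (A).
data = [(0.010 + 0.001 * i, "N") for i in range(10)]
data += [(0.150 + 0.010 * i, "A") for i in range(10)]

train_set, test_set = split(data)
model = train_centroids(train_set)
test_accuracy = accuracy(model, test_set)
```

The test set is never shown to the training function, so the accuracy measures how well the learned rule generalises to unseen recordings.&lt;br /&gt;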
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the values of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or a hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set are then plotted in the same n-dimensional space, and each is assigned a class based on the region in which it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which reduces the computational power required for classification. Finally, a fully-connected layer, usually a regular ANN, is used to classify the data. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are also a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;.&lt;br /&gt;
...&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
...&lt;br /&gt;
see figure 2.13 when we add it&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
|&amp;#039;&amp;#039;Figure 2.13: Example LSTM structure.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Other Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
MathWorks provides an example which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html&amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set gave an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effect on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into the classifiers to measure the importance of the pre-processing step.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
For the SVM, we use two sets of time-series features: Heart Rate Variability (HRV) together with heartbeat morphological features, and Symbolic Dynamics (SD). These are extracted from the raw, wavelet-denoised, and MoV-transformed recordings.&lt;br /&gt;
According to Andreotti et al.&amp;lt;ref name=Andreotti&amp;gt;F. Andreotti et al., Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG, in IEEE Access, 2017; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8331748&amp;lt;/ref&amp;gt;, HRV and morphological features of heartbeats worked well with a Decision Tree (DT) classifier in the AF detection task. Hence, we will test these features with the SVM algorithm.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Features in HRV and heartbeat morphology&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Type !! Features !! Number of features&lt;br /&gt;
|-&lt;br /&gt;
| Time Domain || SDNN, RMSSD, NNx || 8&lt;br /&gt;
|-&lt;br /&gt;
| Frequency Domain || LF power, HF power, LF/HF || 8&lt;br /&gt;
|-&lt;br /&gt;
| Non-linear Features || SampEn, ApEn, Poincaré plot, Recurrence Quantification Analysis || 95&lt;br /&gt;
|-&lt;br /&gt;
| Signal Quality || bSQI, iSQI, kSQI, rSQI || 36&lt;br /&gt;
|-&lt;br /&gt;
| Morphological Features || P-wave power, T-wave power, QT interval|| 22&lt;br /&gt;
|-&lt;br /&gt;
|  || Total || 169 &lt;br /&gt;
|}&lt;br /&gt;
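Three of the time-domain HRV features in the table can be computed directly from a list of RR intervals. The sketch below assumes the usual definitions: SDNN as the sample standard deviation of the intervals, RMSSD as the root mean square of successive differences, and NNx as the number of successive differences larger than x milliseconds. The function names and the example intervals are hypothetical, and this is not the feature-extraction code used in the project.&lt;br /&gt;

```python
import math


def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (ms)."""
    mean = sum(rr_ms) / len(rr_ms)
    return math.sqrt(sum((rr - mean) ** 2 for rr in rr_ms) / (len(rr_ms) - 1))


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def nn_x(rr_ms, x_ms=50):
    """Count successive differences whose magnitude exceeds x_ms."""
    return sum(1 for a, b in zip(rr_ms, rr_ms[1:]) if abs(b - a) > x_ms)


# Hypothetical RR intervals in milliseconds.
rr_ms = [800, 810, 790, 850, 800]
features = {"SDNN": sdnn(rr_ms), "RMSSD": rmssd(rr_ms), "NN50": nn_x(rr_ms)}
```

Irregular rhythms such as AF tend to increase all three values, which is why RR-interval variability is a useful input to the classifier.&lt;br /&gt;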
==== Recurrent Neural Network with Long Short-Term Memory ====&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
=== Comparing Results ===&lt;br /&gt;
In order to understand how different methods compare to one another, a parameter which could give a good representation of how well each classifier performed was needed. We decided to use the F1-score for this purpose...&lt;br /&gt;
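&lt;br /&gt;
For reference, the F1-score of a class is the harmonic mean of its precision and recall. The sketch below computes it per class and then takes the unweighted mean over the classes; the averaging scheme, the class labels and the example predictions are assumptions made for illustration and may differ from how the scores in this project were calculated.&lt;br /&gt;

```python
def f1_for_class(true_labels, predicted_labels, positive):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct detections, so precision and recall are zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(true_labels))
    scores = [f1_for_class(true_labels, predicted_labels, c) for c in classes]
    return sum(scores) / len(scores)


# Hypothetical labels: N = normal, A = atrial fibrillation, O = other.
true_labels = ["N", "N", "A", "A", "O", "O"]
predicted_labels = ["N", "A", "A", "A", "O", "N"]
overall = macro_f1(true_labels, predicted_labels)
```

Unlike plain accuracy, the F1-score penalises a classifier that ignores a rare class, which matters when AF recordings are far less common than normal ones.&lt;br /&gt;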
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of the pre-processing and classification techniques mentioned above. The F1-score of each combination is summarised in Table X below, and these are also shown in Figure X. These results demonstrate that, in general, the CNN outperformed the other classification methods, although the LSTM was not far behind. In all cases, wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.469&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.485&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.483&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || &lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16764</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16764"/>
		<updated>2021-10-20T14:23:38Z</updated>

		<summary type="html">&lt;p&gt;A1798520: /* Pre-Processing Techniques */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref name=PhysioNet&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Often, signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One example of a biological signal of interest is the electrocardiogram (ECG). These signals are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref name=HeartFoundation&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease&amp;lt;/ref&amp;gt;, so the early detection and treatment of CVD is critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project Team ===&lt;br /&gt;
==== Project Students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
=== Project Aim ===&lt;br /&gt;
The aim of this project was to investigate whether machine learning can be used to teach a computer to accurately distinguish between normal and abnormal heart patterns, and even between different heart diseases.&lt;br /&gt;
&lt;br /&gt;
== Background and Relevant Work ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|right|&amp;#039;&amp;#039;Figure 2.1: ECG Signal Waves and Intervals.&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;&amp;#039;&amp;#039;]]&lt;br /&gt;
Electrocardiograms (ECGs) represent the electrical activity of the heart with respect to time. In the human body, the contraction of muscles is associated with changes in the membrane potential (i.e. depolarisation) of cells&amp;lt;ref&amp;gt;P.S. Addison, Wavelet Transforms and the ECG: a Review, in Physiological Measurement, vol. 26, 2005; [Online], Available: https://iopscience.iop.org/article/10.1088/0967-3334/26/5/R01/pdf&amp;lt;/ref&amp;gt;. ECGs can therefore be acquired by placing electrodes on the body (either on the torso or the limbs) and measuring the potential difference between them. The important features in a single cycle of an ECG are shown in Figure 2.1. In particular, the P-wave, T-wave and QRS complex, as well as the time between successive R peaks, are of interest, since an irregularity in, or the absence of, any of these features could indicate an abnormality. The P-wave corresponds to the contraction of the two smaller chambers of the heart (the atria), whereas the QRS complex corresponds to the contraction of the two larger chambers (the ventricles). The contraction of the ventricles pushes blood out of the heart and around the body. The T-wave represents the repolarisation of the ventricles; the repolarisation of the atria is not visible as it coincides with the QRS complex. The RR interval represents the length of time between successive heart beats, so it can quickly indicate whether a patient&amp;#039;s heart is beating in a regular rhythm. ECG acquisition was beyond the scope of this project. Instead, all data was collected from the PhysioNet Database&amp;lt;ref name=PhysioNet/&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Classifying ECGs is a challenging process for a number of reasons. For example, normal ECGs differ between patients, one disease may have dissimilar signs on different patients, and two distinct diseases may have a similar effect on a normal ECG&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. Furthermore, electrodes pick up not only the activity of the heart, but also other muscular contractions. As such, artefacts (for example from motion or breathing), as well as noise, are often overlaid on the ECG. For these reasons, pre-processing and machine learning classification of ECGs may be able to diagnose patients more precisely than manual classification.&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
[[File:CVD-the-facts.png|thumb|right|&amp;#039;&amp;#039;Figure 2.2: Cardiovascular Disease Statistics&amp;#039;&amp;#039;&amp;lt;ref name=HeartFoundation/&amp;gt;]]&lt;br /&gt;
Cardiovascular disease (CVD) is a term that includes heart disease, stroke, and other blood vessel diseases. It is among Australia&amp;#039;s largest health problems, and accounts for around one in four of all deaths. Most CVD risk factors can be prevented through a healthy lifestyle&amp;lt;ref name=HeartFoundation/&amp;gt;, so it is important that CVDs are identified as early and accurately as possible.&lt;br /&gt;
&lt;br /&gt;
CVD can come in many forms, although this project focuses on just one: atrial fibrillation (AF). AF is an abnormal heart condition in which the regular atrial activity is replaced with fast and disorderly tremor waves&amp;lt;ref name=SK_AA&amp;gt;Y. Hu, Y. Zhao, J. Liu, J. Pang, C. Zhang, P. Li, An Effective Frequency-Domain Feature of Atrial Fibrillation Based on Time-Frequency Analysis, in BMC Medical Informatics and Decision Making, vol. 20, 2020; [Online], Available: https://link.springer.com/article/10.1186/s12911-020-01337-1&amp;lt;/ref&amp;gt;. On the ECG, this means the P-waves often disappear, and the RR interval has a variable duration. The incidence of AF increases with age, and the condition is characterised by palpitations, shortness of breath and chest pain.&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: pre-processing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage. Deep learning analysis of ECG waveforms is by no means a new field of work, so this project was able to examine the methods and results produced by a number of previous studies. This section also briefly discusses the processes found in the literature.&lt;br /&gt;
&lt;br /&gt;
==== Pre-processing ====&lt;br /&gt;
Prior to analysing the signal, it is often useful to complete some pre-processing to remove undesirable features including noise, baseline wander, motion artefacts and other interruptions. It is almost always useful to remove noise, and this can be done with conventional high-pass, low-pass and band-pass filters, or with wavelet denoising. For example, Wang et al.&amp;lt;ref name=SK_X/&amp;gt; used a number of different filters to pre-process ECG recordings. They used a 50 Hz notch filter to remove powerline interference, a 30 Hz low-pass filter to remove high-frequency noise, and a 0.1 Hz high-pass filter to remove low-frequency noise and artefacts (such as breathing artefacts). Similarly, Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; used a band-pass filter with cut-off frequencies at 0.5 Hz and 30 Hz, for the same reasons.&lt;br /&gt;
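&lt;br /&gt;
The band-pass idea above can be sketched in a few lines of Python. This is only an illustration and not the filter design used by the cited studies: it cascades two first-order filters, which roll off far more gently than the filters normally used in practice. The function names and the 300 Hz sampling rate in the usage note are assumptions made for the example.&lt;br /&gt;

```python
import math


def high_pass(signal, cutoff_hz, fs_hz):
    """First-order IIR high-pass filter (suppresses baseline drift)."""
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out = [0.0]
    for i in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[i] - signal[i - 1]))
    return out


def low_pass(signal, cutoff_hz, fs_hz):
    """First-order IIR low-pass filter (suppresses high-frequency noise)."""
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = [signal[0]]
    for x in signal[1:]:
        out.append(out[-1] + alpha * (x - out[-1]))
    return out


def band_pass(signal, low_hz, high_hz, fs_hz):
    """Cascade the two filters to keep roughly low_hz to high_hz."""
    return low_pass(high_pass(signal, low_hz, fs_hz), high_hz, fs_hz)
```

For example, band_pass(ecg, 0.5, 30.0, 300.0) would apply the 0.5 Hz and 30 Hz cut-offs quoted above to a list of samples named ecg, assuming it was recorded at 300 Hz.&lt;br /&gt;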
&lt;br /&gt;
Wavelet denoising works in quite a different manner. The signal is first decomposed into wavelet coefficients, which concentrates the signal over only a few large coefficients, and a threshold is then applied to suppress the remaining small coefficients, which mostly represent noise&amp;lt;ref name=SK_L&amp;gt;O. Faust, U.R. Acharya, H. Adeli, A. Adeli; 2015, Wavelet-Based EEG Processing for Computer-Aided Seizure Detection and Epilepsy Diagnosis, in Seizure, vol. 26, 2015, pp 56-64; [Online], Available: https://www.sciencedirect.com/science/article/pii/S1059131115000138&amp;lt;/ref&amp;gt;. Wavelet denoising can have an advantage over traditional filtering as particular types of wavelets are similar in shape to the ECG features. Another advantage of using wavelets is that the wavelet transform gives a time-variant decomposition, making it possible to choose different filtering settings for different time windows.&lt;br /&gt;
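&lt;br /&gt;
A minimal sketch of the decompose, threshold and reconstruct cycle is given below. It uses the Haar wavelet purely because it is the simplest to write out by hand; the wavelet family, the threshold value and the number of levels are all assumptions for illustration, and the signal length is assumed to be divisible by two for every level.&lt;br /&gt;

```python
import math


def haar_forward(signal):
    """One level of the Haar transform: approximation and detail coefficients."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the finer-level signal from one level of coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink every coefficient towards zero by the threshold."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def haar_denoise(signal, threshold, levels=3):
    """Decompose, threshold the detail coefficients, then reconstruct."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append(soft_threshold(detail, threshold))
    for detail in reversed(details):
        approx = haar_inverse(approx, detail)
    return approx
```

With a threshold of zero the input is returned unchanged, while a very large threshold removes every detail coefficient and leaves only the mean of the signal.&lt;br /&gt;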
&lt;br /&gt;
Other pre-processing steps can also be applied. For example, the ECG could be transformed using the Moment of Velocity (MoV)&amp;lt;ref name=MoV&amp;gt;Insert Reference!!&amp;lt;/ref&amp;gt;. The MoV of a signal is similar to its instantaneous frequency, however it is more robust to noise and can suppress large spikes caused by sudden changes. Hence, it is able to provide spectral information in a more convenient way.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
In general, machine learning works by classifying data based on a number of features in that data. It is generally quicker and more accurate to do this based on a small set of features instead of the raw data, hence it is important to extract an appropriate set of features. For example, if we were classifying different types of fruit we might choose features such as colour and shape to distinguish between the different types. The features required to classify ECG signals are more abstract, but the principle is the same. Features which are consistent within a class, but which vary between classes are desirable.&lt;br /&gt;
&lt;br /&gt;
Features can come from the time domain, frequency domain, or even the time-frequency domain. In the time domain, features can include the detection of R-peaks and hence RR-intervals, the shape of the QRS complex, or the duration of the P-wave and various intervals. Often variation within a given ECG, particularly variation of the RR-interval, is indicative of an abnormality&amp;lt;ref name=SK_AA/&amp;gt;. One method of extracting the QRS complex discussed in the literature was a process called the Pan-Tompkins algorithm&amp;lt;ref name=SK_B&amp;gt;S.H. Jambukia, V.K. Dabhi, H.B. Prajapati, Classification of ECG Signals using Machine Learning Techniques: a Survey, IEEE, 2015; [Online], Available: https://ieeexplore.ieee.org/abstract/document/7164783&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_N&amp;gt;Y. Palaniappan, V.A. Vishanth, N. Santhosh, R. Karthika, M. Ganesan; 2020, R-Peak Detection Using Altered Pan-Tompkins Algorithm, IEEE, 2020; [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9182298&amp;lt;/ref&amp;gt;. This process involves filtering and differentiating the ECG to remove noise and low-frequency components, squaring the signal to enhance high-frequency components, and finally using a moving-window integrator to extract the slope of the R-waves. Each stage of this algorithm is shown below in Figure 2.4, and the result overlaid on an ECG is shown in Figure 2.5.&lt;br /&gt;
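&lt;br /&gt;
The later stages of this process can be sketched as follows. This is a simplified illustration rather than the published algorithm: the input is assumed to be band-pass filtered already, a plain first difference stands in for the derivative filter, and the window length is a free parameter chosen by the caller.&lt;br /&gt;

```python
def derivative(signal):
    """First difference, emphasising the steep slopes of the QRS complex."""
    return [0.0] + [signal[i] - signal[i - 1] for i in range(1, len(signal))]


def squared(signal):
    """Point-by-point squaring: all values become positive, large slopes dominate."""
    return [v * v for v in signal]


def moving_window_integral(signal, window):
    """Average over the most recent samples, up to the window length."""
    return [
        sum(signal[max(0, i - window + 1):i + 1]) / window
        for i in range(len(signal))
    ]


def qrs_envelope(filtered_ecg, window):
    """Chain the three stages; peaks of the result mark candidate QRS complexes."""
    return moving_window_integral(squared(derivative(filtered_ecg)), window)
```

The output is large only near sharp deflections, so a threshold or peak search applied to it can locate the R-peaks and hence the RR-intervals.&lt;br /&gt;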
&lt;br /&gt;
Conversely, features can come from the frequency domain. The main features in an ECG signal are contained within a frequency range of about 0.5-30 Hz &amp;lt;ref name=SK_AA/&amp;gt;&amp;lt;ref name=SK_X&amp;gt;J. Wang, P. Wang, S. Wang, Automated Detection of Atrial Fibrillation in ECG Signals Based on Wavelet Packet Transform and Correlation Function of Random Process, in Biomedical Signal Processing and Control, vol. 55, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1746809419302435&amp;lt;/ref&amp;gt;, with components outside this range largely corresponding to noise. Hu et al.&amp;lt;ref name=SK_AA/&amp;gt; demonstrated that the frequency component with the maximum amplitude may be important to identify. In normal signals, this is around 1Hz, but is more volatile in patients with AF, where it can range from 2 to 8 Hz. However, ECG signals are non-stationary data, meaning their properties can&amp;#039;t be fully described with frequency domain information. This is where time-frequency features come in.&lt;br /&gt;
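&lt;br /&gt;
The frequency component with the maximum amplitude can be found with a discrete Fourier transform. The sketch below writes the transform out directly so that it needs no external libraries; it is slow for long recordings, and the function names are assumptions for the example rather than the method used by Hu et al.&lt;br /&gt;

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of each DFT bin from 0 Hz up to half the sampling rate."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [v - mean for v in signal]  # remove the constant offset
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centred)
        )
        mags.append(abs(coeff))
    return mags


def dominant_frequency(signal, fs_hz):
    """Frequency in Hz of the bin with the largest magnitude."""
    mags = dft_magnitudes(signal)
    k_best = max(range(len(mags)), key=mags.__getitem__)
    return k_best * fs_hz / len(signal)
```

The frequency resolution is the sampling rate divided by the number of samples, so longer segments give a finer estimate.&lt;br /&gt;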
&lt;br /&gt;
Time-frequency features demonstrate how the frequency content of a non-stationary signal varies with time. One such tool for time-frequency analysis is a scalogram. The scalogram is displayed as an image, which can be used for classification by a CNN. Figure 2.6 shows a scalogram for a normal ECG pattern, and Figure 2.7 shows a scalogram for a patient with AF. Another time-frequency feature extraction technique which can be used is that of wavelet decomposition. Similar to decomposing a signal into a sum of sinusoids in Fourier analysis in the frequency domain, wavelet decomposition decomposes the signal into a sum of wavelets &amp;lt;ref name=SK_FA&amp;gt;N. Emanet, ECG Beat Classification by Using Discrete Wavelet Transform and Random Forest Algorithm, IEEE, 2009, [Online]. DOI: 10.1109/ICSCCW.2009.5379457&amp;lt;/ref&amp;gt;. The idea of wavelet decomposition is to reduce a large signal (for example 9000 samples long) to a shorter set of features (e.g. 190). This can significantly decrease computational time while increasing performance. A comparison of the ECG, wavelet denoised ECG and the MoV is shown in Figure 2.9.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
|&amp;#039;&amp;#039;Figure 2.3: Frequency Spectrum of an ECG.&amp;#039;&amp;#039;&lt;br /&gt;
File:Stages of Pan Tompkins algorithm.png|&amp;#039;&amp;#039;Figure 2.4: Stages of the Pan-Tompkins Algorithm&amp;#039;&amp;#039;&lt;br /&gt;
File:Pan Tompkins result.png|&amp;#039;&amp;#039;Figure 2.5: Comparison of ECG and extracted QRS using Pan-Tompkins&amp;#039;&amp;#039;&lt;br /&gt;
File:N 150.jpg|&amp;#039;&amp;#039;Figure 2.6: Scalogram of Normal ECG&amp;#039;&amp;#039;&lt;br /&gt;
File:A 44.jpg|&amp;#039;&amp;#039;Figure 2.7: Scalogram of ECG with AF&amp;#039;&amp;#039;&lt;br /&gt;
File:Wavelet decomposition of ECG.png|&amp;#039;&amp;#039;Figure 2.8: Wavelet Decomposition of an ECG&amp;#039;&amp;#039;&lt;br /&gt;
File:ECG wavelet denoise and mov.png|&amp;#039;&amp;#039;Figure 2.9: ECG Compared with Wavelet Denoised ECG and MoV.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Classification and Validation ====&lt;br /&gt;
ECG classification is a multi-class classification problem&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;, including classes such as normal and abnormal, and possibly further separating the abnormal class into specific conditions. Classification can be completed using many different methods. In this project, the classification step has made use of a number of machine learning (ML) techniques. ML is an application of artificial intelligence in which algorithms parse data, learn which features correspond to which class, and then apply this to make an informed decision on new data.&lt;br /&gt;
&lt;br /&gt;
In order to train the machine, the data is split into a &amp;quot;training set&amp;quot; and a &amp;quot;test set&amp;quot;. First, the training set and its correct labels are given to the machine to teach it how to identify each class in the data. Depending on the ML algorithm, this may make clusters of each class, or assign weights to a neural network, for example. Next, the trained model is used to classify the test set of data. The effectiveness of the method is then validated by comparing the assigned classes to the actual classes for all the data in the test set.&lt;br /&gt;
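&lt;br /&gt;
The split and the final comparison can be sketched as below. The 20% test fraction, the fixed random seed and the function names are assumptions chosen for the example, not the values used in this project.&lt;br /&gt;

```python
import random


def train_test_split(records, labels, test_fraction=0.2, seed=0):
    """Shuffle the record indices and hold out a fraction as the test set."""
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(records[i], labels[i]) for i in train_idx]
    test = [(records[i], labels[i]) for i in test_idx]
    return train, test


def accuracy(predicted, actual):
    """Fraction of test records whose assigned class matches the actual class."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)
```

Keeping the test records out of training is what makes the final comparison a fair estimate of performance on new data.&lt;br /&gt;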
&lt;br /&gt;
A number of ML algorithms are of interest, including the support vector machine (SVM), convolutional neural network (CNN) and recurrent neural network with long short-term memory (LSTM). Each of these is described briefly below.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Support Vector Machine&amp;#039;&amp;#039;&lt;br /&gt;
[[File:SVM example.JPG|thumb|right|upright=0.75|&amp;#039;&amp;#039;Figure 2.10: Example 2D SVM with new data point in green.&amp;#039;&amp;#039;]]An SVM is a supervised machine learning algorithm which can be used to classify data based on the value of a number of features. Each signal in the training set is plotted in n-dimensional space (where &amp;#039;n&amp;#039; is the number of features), then a line (or hyperplane in higher-dimensional space) is drawn between the clusters of each category to best separate the data. The signals in the test set of data are then plotted in the same n-dimensional space, and each is assigned a class based on the location in which it falls. Figure 2.10 shows a simple 2-dimensional example with class 1 in red and class 2 in blue. If a new data point, such as the green dot in Figure 2.10, is introduced, the SVM will classify it as class 2, given the side of the line it falls on.&lt;br /&gt;
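&lt;br /&gt;
To make the idea of a separating line concrete, the sketch below trains a linear soft-margin SVM by subgradient descent on the regularised hinge loss. It is a toy illustration under stated assumptions and not the classifier used in this project: the labels +1 and -1 stand in for the two classes, and the learning rate, regularisation strength and number of epochs are arbitrary choices.&lt;br /&gt;

```python
def decision_value(weights, bias, features):
    """Signed score of a point relative to the separating hyperplane."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_linear_svm(samples, labels, learning_rate=0.1, reg=0.01, epochs=200):
    """Fit weights and bias by subgradient descent on the hinge loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            margin = label * decision_value(weights, bias, features)
            if 1.0 > margin:
                # Point is inside the margin or misclassified: move the hyperplane.
                weights = [
                    w + learning_rate * (label * x - reg * w)
                    for w, x in zip(weights, features)
                ]
                bias += learning_rate * label
            else:
                # Point is safely classified: only apply the regularisation shrink.
                weights = [w - learning_rate * reg * w for w in weights]
    return weights, bias


def predict(weights, bias, features):
    """Assign a class according to the side of the hyperplane the point falls on."""
    return 1 if decision_value(weights, bias, features) >= 0.0 else -1
```

Once trained, a new point is classified by evaluating predict on its feature values, which corresponds to checking which side of the line in Figure 2.10 it falls on.&lt;br /&gt;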
&lt;br /&gt;
Many previous studies have made use of an SVM to classify ECG data&amp;lt;ref name=SK_V&amp;gt;H. Li, et al., Arrhythmia Classification Algorithm Based on Multi-Feature and Multi-Type Optimised SVM, in the American Scientific Research Journal for Engineering, Technology and Sciences (ASRJETS), vol. 63, No 1, 2020, pp 72-86; [Online]. Available: https://asrjetsjournal.org/index.php/American_Scientific_Journal/article/view/5509/2046&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Z&amp;gt;Y. Zhang, S. Wei, L. Zhang, C. Liu, Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features, in Journal of Medical and Biological Engineering, vol. 39, 2019, pp 381-392. [Online], Available: https://link.springer.com/article/10.1007/s40846-018-0411-0&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_Q&amp;gt;C. Venkatesan, et al.; ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications; IEEE, 2018; Accessed 20 March 2021; [Online] DOI: 10.1109/ACCESS.2018.2794346&amp;lt;/ref&amp;gt;. Venkatesan et al.&amp;lt;ref name=SK_Q/&amp;gt; achieved a 96% accuracy for sorting normal and abnormal ECG signals based on a range of time- and frequency-domain features. Zhang et al.&amp;lt;ref name=SK_Z/&amp;gt; tested a range of SVMs, and found a least-squares SVM to be more effective than the others, achieving an accuracy of over 92%. Li et al.&amp;lt;ref name=SK_V/&amp;gt; extended the idea of SVM classification by experimenting with ways in which it could be optimised. Among others, they found particle swarm algorithms and genetic algorithms to be effective, achieving an accuracy of over 95% in each case.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Convolutional Neural Network&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
An artificial neural network (ANN) is capable of extracting complex and non-linear sets of features from a set of data. ANNs are constructed to simulate neurons in the biological nervous system, and so are composed of many interconnected units linked with various weighting factors. The weight of each connection determines its contribution and can be adjusted through training. The general structure of an ANN is shown in Figure 2.11.&lt;br /&gt;
&lt;br /&gt;
Building on ANNs, CNNs add processing stages to the input of the neural network. The convolution layers extract features from the input data, and the pooling layers reduce the size of these features, which decreases the computational power required for classification. Finally, a fully-connected layer is used to classify the data, and this is usually a regular ANN. CNNs are particularly useful for classifying images, for example hand-written numbers as in the diagram in Figure 2.12.&lt;br /&gt;
&lt;br /&gt;
CNNs are also a well-tested means of classifying ECG signals&amp;lt;ref name=SK_R&amp;gt;J. Huang, B. Chen, B. Yao, W. He, ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Networks, in IEEE Access, vol. 7, 2019; [Online]. Available: https://ieeexplore.ieee.org/document/8759878&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_S&amp;gt;M. Rashed-Al-Mahfuz, M.A. Moni, P. Lio, S.M.S. Islam, S. Berkovsky, M. Khushi, J.M.W. Quinn, Deep Convolutional Neural Networks Based ECG Beats Classification to Diagnose Cardiovascular Conditions, in Biomedical Engineering Letters, vol 11, 2021, pp 147-162; [Online], Available: https://link.springer.com/article/10.1007/s13534-021-00185-w&amp;lt;/ref&amp;gt;&amp;lt;ref name=SK_W&amp;gt;O.S. Lih, et al., Comprehensive Electrocardiographic Diagnosis Based on Deep Learning, in Artificial Intelligence in Medicine, vol. 103, 2020; [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0933365719309030&amp;lt;/ref&amp;gt;.&lt;br /&gt;
...&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;Long Short-Term Memory&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
...&lt;br /&gt;
see figure 2.13 when we add it&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=200px&amp;gt;&lt;br /&gt;
File:ANN example.png|&amp;#039;&amp;#039;Figure 2.11: Example ANN structure.&amp;#039;&amp;#039;&amp;lt;ref name=SK_G&amp;gt;L. Chang, Z. Zhang, L. Ye, D. Friedrich, Synergistic Effects of Nanoparticles and Traditional Tribofillers on Sliding Wear of Polymeric Hybrid Composites, in Tribology of Polymeric Nanocomposites, 2nd ed., 2013, pp 49-89; [Online], Available: https://www.sciencedirect.com/science/article/pii/B9780444594556000039&amp;lt;/ref&amp;gt;&lt;br /&gt;
File:CNN example.jpg|&amp;#039;&amp;#039;Figure 2.12: Example CNN structure, for identifying hand-written numbers.&amp;#039;&amp;#039;&amp;lt;ref name=SK_H&amp;gt;S. Saha, A Comprehensive Guide to Convolutional Neural Networks – the ELI5 Way, 16 Dec 2018, Accessed: 24 May 2021, [Online], Available: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53&amp;lt;/ref&amp;gt;&lt;br /&gt;
|&amp;#039;&amp;#039;Figure 2.13: Example LSTM structure.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Method ==&lt;br /&gt;
In completing this project, we investigated the effect of a range of different pre-processing techniques and classification algorithms on classifying the same set of data. &lt;br /&gt;
&lt;br /&gt;
=== Preliminary Work: Manual Analysis of ECG ===&lt;br /&gt;
As a first step in analysing different classes of ECG waveforms, we analysed a few signals to identify the relevant waves and segments in the signal.&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Healthy (Normal) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
Figure 3.1 shows an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Atrial Fibrillation) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.2 is an example of an ECG waveform in which the patient has AF. In the ECG, AF is usually characterised by abnormal or missing P-waves, and variable RR intervals. This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present, all of which are consistent with the usual signs of AF.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Abnormal (Other) ECG&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The waveform in Figure 3.3 is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery mode=packed heights=300px&amp;gt;&lt;br /&gt;
File:Normal ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.1: Relevant features of a normal ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
File:AF ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.2: ECG waveform of patient with AF.&amp;#039;&amp;#039;&lt;br /&gt;
File:Other ECG Annotated Waveform.png|&amp;#039;&amp;#039;Figure 3.3: Other heart abnormality ECG waveform.&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
There is an example on MathWorks which demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;Mathworks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM is then used to classify the signals based on the features extracted. The data was split into two sets: a training set and a test set. The training set was used to train the machine on how to classify the signals, and the test set was used to measure the accuracy of the machine. Each signal belonged to one of three different categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the results from the test set produced an accuracy of approximately 98%. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
=== Pre-Processing Techniques ===&lt;br /&gt;
Since most previous ECG classification projects use traditional Fourier Transform (FT) based filters to denoise a signal, we will investigate the effects on the ECGs of two other filtering methods discussed in the literature. Wavelet denoising and Moment of Velocity will be applied to the same dataset, then the raw dataset and its cleaned versions will be fed into the classifiers to measure the importance of the pre-processing stage.&lt;br /&gt;
==== Wavelet Denoising ====&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Moment of Velocity ====&lt;br /&gt;
&lt;br /&gt;
=== Classification Models ===&lt;br /&gt;
Based on the results found in the literature, we decided to analyse a number of classification models.&lt;br /&gt;
==== Support Vector Machine ====&lt;br /&gt;
&lt;br /&gt;
==== Recurrent Neural Network with Long Short-Term Memory ====&lt;br /&gt;
&lt;br /&gt;
==== Convolutional Neural Network ====&lt;br /&gt;
&lt;br /&gt;
=== Comparing Results ===&lt;br /&gt;
In order to understand how different methods compare to one another, a parameter which could give a good representation of how well each classifier performed was needed. We decided to use the F1-score for this purpose...&lt;br /&gt;
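&lt;br /&gt;
For reference, the F1-score of a class is the harmonic mean of its precision and recall. The sketch below computes it per class and then averages over the classes. The unweighted (macro) average is an assumption made for the example, since how the per-class scores were combined is not stated here, and the class names in the usage note are placeholders.&lt;br /&gt;

```python
def f1_for_class(actual, predicted, positive):
    """F1-score treating one class as positive and all others as negative."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct detections, so precision and recall are zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


def macro_f1(actual, predicted):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(actual))
    return sum(f1_for_class(actual, predicted, c) for c in classes) / len(classes)
```

Unlike plain accuracy, this score penalises a classifier that ignores a rare class, which matters when one class has far fewer recordings than another.&lt;br /&gt;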
&lt;br /&gt;
== Results ==&lt;br /&gt;
We tested most combinations of pre-processing and classification techniques mentioned above. From our findings, the F1-score of each is summarised in Table X below, and these are also expressed in Figure X. These results demonstrate that in general the CNN outperformed the other classification methods, although the LSTM was not far behind. In all cases the wavelet denoising was the most effective pre-processing technique.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Table X: Summary of Results&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{| class=&amp;quot;wikitable sortable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Classification Method !! Pre-processing Stages !! Features Extracted !! F1-score&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Raw ECG data ||  || 0.469&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising ||  || 0.485&lt;br /&gt;
|-&lt;br /&gt;
| SVM || Wavelet Denoising and Moment of Velocity ||  || 0.483&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Raw ECG data || Spectrogram || 0.771&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising || Spectrogram || 0.848&lt;br /&gt;
|-&lt;br /&gt;
| CNN || Wavelet Denoising and Moment of Velocity || Spectrogram || 0.816&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || None - computed on raw ECG data || &lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Raw ECG data || Instantaneous frequency, Entropy || 0.686&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising || Instantaneous frequency, Entropy || 0.817&lt;br /&gt;
|-&lt;br /&gt;
| LSTM || Wavelet Denoising and Moment of Velocity || Instantaneous frequency, Entropy || 0.657&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[File:F1 Scores of Results.png|700px|thumb|center|&amp;#039;&amp;#039;Figure X: Comparison of Results for each Technique.&amp;#039;&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
Our results, ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Future work could be done to improve classification performance. This could be done by finding a different classifier which is better suited to ECG identification, or &lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16182</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16182"/>
		<updated>2021-04-28T08:35:27Z</updated>

		<summary type="html">&lt;p&gt;A1798520: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment, or the human body. Often signals collected from the human body are used to measure or verify a patient&amp;#039;s health. One biological signal of interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart, which record the electrical activity of the heart. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than when done manually&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project team ===&lt;br /&gt;
==== Project students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
== Background ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
Electrocardiograms (ECGs) are collected by measuring the electrical activity of the heart by placing electrodes on the skin. The waveforms produced can be used to identify the presence of cardiac abnormalities. &lt;br /&gt;
The most important features in a single cycle of an ECG are shown in the figure below&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;. In particular the P wave, T wave and QRS complex, as well as time between subsequent R peaks are of interest, since any irregularity or absence in any of these features could indicate an abnormality.&lt;br /&gt;
&lt;br /&gt;
[[https://projectswiki.eleceng.adelaide.edu.au/projects/index.php/File:ECG_waveform.gif|thumb|center]]&lt;br /&gt;
&lt;br /&gt;
(ECG image will be added when I figure out how to do so...)&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: preprocessing, feature extraction and selection, classification, and validation. This section will describe what each of these steps entails, and list techniques which can be used at each stage.&lt;br /&gt;
&lt;br /&gt;
==== Preprocessing ====&lt;br /&gt;
Noise must first be filtered out of the ECG signal to allow more precise classification. This preprocessing stage removes motion artifacts and high-frequency disturbances that overlap the ECG. Low-pass, high-pass and Butterworth band-pass filters, as well as wavelet denoising, are used to eliminate this noise&amp;lt;ref&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
The QRS complex and the P and T waves are identified before extracting the features used as input to machine learning classification. Different signal features such as inter-beat mean value, standard deviation, heart rate variability, … are calculated using the time series of beat-to-beat intervals between these waves&amp;lt;ref&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;. The feature selection process is designed to choose the features which are best suited to optimised classification &amp;lt;ref&amp;gt;S. Celin, K. Vasanth, ECG Signal Classification Using Various Machine Learning Techniques, J Med Syst 42, 241 (2018), Available at: https://doi.org/10.1007/s10916-018-1083-6&amp;lt;/ref&amp;gt;.&lt;br /&gt;
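&lt;br /&gt;
As an illustration of the interval-based features mentioned above, the sketch below computes the mean, standard deviation and a simple variability ratio from a list of R-peak positions. The R-peak sample indices are assumed to have been detected already, and the dictionary keys and the 300 Hz sampling rate in the usage note are assumptions for the example.&lt;br /&gt;

```python
import statistics


def rr_features(r_peak_samples, fs_hz):
    """Summary statistics of the beat-to-beat (RR) intervals, in seconds."""
    rr = [
        (later - earlier) / fs_hz
        for earlier, later in zip(r_peak_samples, r_peak_samples[1:])
    ]
    mean_rr = statistics.mean(rr)
    sd_rr = statistics.pstdev(rr)
    return {
        "mean_rr_s": mean_rr,
        "sd_rr_s": sd_rr,
        "heart_rate_bpm": 60.0 / mean_rr,
        "variation": sd_rr / mean_rr,  # dimensionless spread of the intervals
    }
```

For example, rr_features([0, 300, 600, 900], 300.0) describes a perfectly regular 60 beats per minute rhythm, whereas an irregular rhythm gives a larger standard deviation and variation.&lt;br /&gt;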
&lt;br /&gt;
==== Classification ====&lt;br /&gt;
&lt;br /&gt;
==== Validation ====&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
Cardiovascular disease (CVD) is ...&lt;br /&gt;
&lt;br /&gt;
== Processing and Classification Techniques ==&lt;br /&gt;
This section will describe the techniques mentioned in the previous section, and how ...&lt;br /&gt;
=== Wavelet Denoising ===&lt;br /&gt;
&lt;br /&gt;
=== Support Vector Machine ===&lt;br /&gt;
&lt;br /&gt;
=== Artificial Neural Networks ===&lt;br /&gt;
&lt;br /&gt;
=== Convolutional Neural Networks ===&lt;br /&gt;
&lt;br /&gt;
== Preliminary Work ==&lt;br /&gt;
So far, we have been able to analyse a few existing ECG classification tools and methods. These will be analysed briefly here.&lt;br /&gt;
=== Manual Analysis of ECGs ===&lt;br /&gt;
Records of ECGs were downloaded from physionet.org&amp;lt;ref&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; and plotted using MATLAB. We analysed these ourselves, and described the difference between the three different types of ECGs provided in the data.&lt;br /&gt;
&lt;br /&gt;
==== Healthy (Normal) ECG ====&lt;br /&gt;
Below is an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
[[File:Normal ECG Annotated Waveform.png|thumb|center|Relevant features of a normal ECG waveform.]]&lt;br /&gt;
&lt;br /&gt;
==== Abnormal (Atrial Fibrillation) ECG ====&lt;br /&gt;
The waveform below is an example of an ECG waveform in which the patient has a heart condition known as atrial fibrillation (AF). This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present. AF is usually characterised by ...&lt;br /&gt;
[[File:AF ECG Annotated Waveform.png|thumb|center|ECG waveform of patient with AF]]&lt;br /&gt;
&lt;br /&gt;
==== Abnormal (Other) ECG ====&lt;br /&gt;
The waveform below is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
[[File:Other ECG Annotated Waveform.png|thumb|center|Other heart abnormality ECG waveform.]]&lt;br /&gt;
&lt;br /&gt;
=== PQRSTdetection Algorithm ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
A MathWorks example demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM then classifies the signals based on the extracted features. The data was split into two sets: a training set and a test set. The training set was used to train the classifier, and the test set was used to measure its accuracy. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the classifier achieved an accuracy of approximately 98% on the test set. We will use this as a baseline for comparison.&lt;br /&gt;
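The train and test procedure described above can be sketched in Python. This is not the MathWorks example: a nearest-centroid classifier stands in for the SVM so that the sketch needs no external libraries, and the feature vectors are assumed to have been extracted already. All function names, the 30% test fraction and the random seed are assumptions for illustration.&lt;br /&gt;

```python
import random


def take(ids, seq):
    """Select the items of seq at the given indices."""
    return [seq[i] for i in ids]


def split(samples, labels, test_fraction=0.3, seed=0):
    """Shuffle indices and hold out a fraction of the data for testing."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(test_fraction * len(idx)))
    test, train = idx[:n_test], idx[n_test:]
    return take(train, samples), take(train, labels), take(test, samples), take(test, labels)


def fit_centroids(samples, labels):
    """Training step: compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def accuracy(centroids, samples, labels):
    """Fraction of held-out samples that are classified correctly."""
    hits = sum(predict(centroids, x) == y for x, y in zip(samples, labels))
    return hits / len(labels)
```

The model is fitted only on the training portion and scored only on the held-out portion, so the reported accuracy reflects signals the classifier has not seen.&lt;br /&gt;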
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
(Conclusions will be added as the project draws to a close).&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16181</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16181"/>
		<updated>2021-04-28T06:32:54Z</updated>

		<summary type="html">&lt;p&gt;A1798520: /* Preprocessing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Signals collected from the human body are often used to measure or verify a patient&amp;#039;s health. One biological signal of interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart to record its electrical activity. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than manual analysis&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project team ===&lt;br /&gt;
==== Project students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
== Background ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
Electrocardiograms (ECGs) are collected by placing electrodes on the skin to measure the electrical activity of the heart. The waveforms produced can be used to identify the presence of cardiac abnormalities. &lt;br /&gt;
The most important features in a single cycle of an ECG are shown in the figure below&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;. In particular, the P wave, T wave and QRS complex, as well as the time between successive R peaks, are of interest, since an irregularity in, or the absence of, any of these features could indicate an abnormality.&lt;br /&gt;
&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|center]]&lt;br /&gt;
&lt;br /&gt;
(ECG image will be added when I figure out how to do so...)&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: preprocessing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage.&lt;br /&gt;
&lt;br /&gt;
==== Preprocessing ====&lt;br /&gt;
Noise must first be filtered out of the ECG signal to allow more precise classification. This preprocessing stage removes motion artifacts that overlap the ECG, as well as high-frequency disturbances. Low-pass, high-pass and Butterworth band-pass filters, together with wavelet denoising, are used to remove this noise[5].&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
&lt;br /&gt;
==== Classification ====&lt;br /&gt;
&lt;br /&gt;
==== Validation ====&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
Cardiovascular disease (CVD) is ...&lt;br /&gt;
&lt;br /&gt;
== Processing and Classification Techniques ==&lt;br /&gt;
This section will describe the techniques mentioned in the previous section, and how ...&lt;br /&gt;
=== Wavelet Denoising ===&lt;br /&gt;
&lt;br /&gt;
=== Support Vector Machine ===&lt;br /&gt;
&lt;br /&gt;
=== Artificial Neural Networks ===&lt;br /&gt;
&lt;br /&gt;
=== Convolutional Neural Networks ===&lt;br /&gt;
&lt;br /&gt;
== Preliminary Work ==&lt;br /&gt;
So far, we have been able to analyse a few existing ECG classification tools and methods. These are described briefly here.&lt;br /&gt;
=== Manual Analysis of ECGs ===&lt;br /&gt;
Records of ECGs were downloaded from physionet.org&amp;lt;ref&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; and plotted using MATLAB. We analysed these ourselves, and described the difference between the three different types of ECGs provided in the data.&lt;br /&gt;
&lt;br /&gt;
==== Healthy (Normal) ECG ====&lt;br /&gt;
Below is an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
[[File:Normal ECG Annotated Waveform.png|thumb|center|Relevant features of a normal ECG waveform.]]&lt;br /&gt;
&lt;br /&gt;
==== Abnormal (Atrial Fibrillation) ECG ====&lt;br /&gt;
The waveform below is an example of an ECG waveform in which the patient has a heart condition known as atrial fibrillation (AF). This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present. AF is usually characterised by ...&lt;br /&gt;
[[File:AF ECG Annotated Waveform.png|thumb|center|ECG waveform of patient with AF]]&lt;br /&gt;
&lt;br /&gt;
==== Abnormal (Other) ECG ====&lt;br /&gt;
The waveform below is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
[[File:Other ECG Annotated Waveform.png|thumb|center|Other heart abnormality ECG waveform.]]&lt;br /&gt;
&lt;br /&gt;
=== PQRSTdetection Algorithm ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
A MathWorks example demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM then classifies the signals based on the extracted features. The data was split into two sets: a training set and a test set. The training set was used to train the classifier, and the test set was used to measure its accuracy. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the classifier achieved an accuracy of approximately 98% on the test set. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
(Conclusions will be added as the project draws to a close).&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
	<entry>
		<id>https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16180</id>
		<title>Projects:2021s1-13434 Can we teach a machine to be a cardiologist?</title>
		<link rel="alternate" type="text/html" href="https://projectswiki.eleceng.adelaide.edu.au/projects/index.php?title=Projects:2021s1-13434_Can_we_teach_a_machine_to_be_a_cardiologist%3F&amp;diff=16180"/>
		<updated>2021-04-28T06:28:14Z</updated>

		<summary type="html">&lt;p&gt;A1798520: /* Preprocessing */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Projects]]&lt;br /&gt;
[[Category:Final Year Projects]]&lt;br /&gt;
[[Category:2021s1|13434]]&lt;br /&gt;
Electrocardiograms (ECGs) are an important biological signal. They are a measurement of the electrical activity of the heart and can be used to diagnose a number of cardiovascular diseases (CVD). Machine learning (ML) techniques can be used to identify the important features of an ECG and then classify these into normal and abnormal groups. So far, ECG recordings have been collected from the PhysioNet&amp;lt;ref&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; database, and have been analysed by hand and using existing ML techniques &amp;lt;ref&amp;gt;PQRSTdetection, MathWorks, Available: https://au.mathworks.com/matlabcentral/fileexchange/66098-ecg-p-qrs-t-wave-detecting-matlab-code&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
As engineers, we frequently work with a range of signals and signal processing techniques. These signals may come from anywhere, including electrical equipment or the human body. Signals collected from the human body are often used to measure or verify a patient&amp;#039;s health. One biological signal of interest is the electrocardiogram (ECG). ECGs are collected by placing electrodes on the skin around the heart to record its electrical activity. Any abnormalities in the signal may be an indication of a cardiovascular disease (CVD). CVD affects around one in six Australians and contributes to 26% of deaths&amp;lt;ref&amp;gt;Heart Foundation, Available: https://www.heartfoundation.org.au/activities-finding-or-opinion/key-stats-cardiovascular-disease &amp;lt;/ref&amp;gt;, so early detection and treatment are critical.&lt;br /&gt;
&lt;br /&gt;
There has been recent interest in using machine learning (ML) techniques to identify features of, and then classify, ECG signals. ML techniques could make it possible to diagnose patients more precisely than manual analysis&amp;lt;ref&amp;gt;S. H. Jambukia, V. K. Dabhi, H. B. Prajapati; Classification of ECG signals using machine learning techniques: A survey; IEEE, 2015; Accessed: 16 March 2021; [Online] DOI: 10.1109/ICACEA.2015.7164783&amp;lt;/ref&amp;gt;. In this project, we will explore various methods of classifying ECGs in this way, and look for ways to improve the accuracy of the process.&lt;br /&gt;
&lt;br /&gt;
=== Project team ===&lt;br /&gt;
==== Project students ====&lt;br /&gt;
* Sonia Kleinig&lt;br /&gt;
* Hien Long Nguyen&lt;br /&gt;
==== Supervisors ====&lt;br /&gt;
* Derek Abbott&lt;br /&gt;
* Mohsen Dorraki&lt;br /&gt;
&lt;br /&gt;
== Background ==&lt;br /&gt;
=== Electrocardiograms ===&lt;br /&gt;
Electrocardiograms (ECGs) are collected by placing electrodes on the skin to measure the electrical activity of the heart. The waveforms produced can be used to identify the presence of cardiac abnormalities. &lt;br /&gt;
The most important features in a single cycle of an ECG are shown in the figure below&amp;lt;ref&amp;gt;ResearchGate, ECG Schematic, Available: https://www.researchgate.net/figure/Schematic-representation-of-normal-ECG-waveform_fig3_287200946&amp;lt;/ref&amp;gt;. In particular, the P wave, T wave and QRS complex, as well as the time between successive R peaks, are of interest, since an irregularity in, or the absence of, any of these features could indicate an abnormality.&lt;br /&gt;
&lt;br /&gt;
[[File:ECG_waveform.gif|thumb|center]]&lt;br /&gt;
&lt;br /&gt;
(ECG image will be added when I figure out how to do so...)&lt;br /&gt;
&lt;br /&gt;
=== ECG Analysis Steps ===&lt;br /&gt;
Analysing and classifying ECG waveforms involves four steps: preprocessing, feature extraction and selection, classification, and validation. This section describes what each of these steps entails, and lists techniques which can be used at each stage.&lt;br /&gt;
&lt;br /&gt;
==== Preprocessing ====&lt;br /&gt;
Noise must first be filtered out of the ECG signal to allow more precise classification. This preprocessing stage removes motion artifacts that overlap the ECG, as well as high-frequency disturbances. Low-pass, high-pass and Butterworth band-pass filters, together with wavelet denoising, are used to remove this noise.&lt;br /&gt;
&lt;br /&gt;
==== Feature Extraction and Selection ====&lt;br /&gt;
&lt;br /&gt;
==== Classification ====&lt;br /&gt;
&lt;br /&gt;
==== Validation ====&lt;br /&gt;
&lt;br /&gt;
=== Cardiovascular Disease ===&lt;br /&gt;
Cardiovascular disease (CVD) is ...&lt;br /&gt;
&lt;br /&gt;
== Processing and Classification Techniques ==&lt;br /&gt;
This section will describe the techniques mentioned in the previous section, and how ...&lt;br /&gt;
=== Wavelet Denoising ===&lt;br /&gt;
&lt;br /&gt;
=== Support Vector Machine ===&lt;br /&gt;
&lt;br /&gt;
=== Artificial Neural Networks ===&lt;br /&gt;
&lt;br /&gt;
=== Convolutional Neural Networks ===&lt;br /&gt;
&lt;br /&gt;
== Preliminary Work ==&lt;br /&gt;
So far, we have been able to analyse a few existing ECG classification tools and methods. These are described briefly here.&lt;br /&gt;
=== Manual Analysis of ECGs ===&lt;br /&gt;
Records of ECGs were downloaded from physionet.org&amp;lt;ref&amp;gt;PhysioNet, Available: https://physionet.org/content/challenge-2017/1.0.0/&amp;lt;/ref&amp;gt; and plotted using MATLAB. We analysed these ourselves, and described the difference between the three different types of ECGs provided in the data.&lt;br /&gt;
&lt;br /&gt;
==== Healthy (Normal) ECG ====&lt;br /&gt;
Below is an example of a normal, healthy, ECG waveform. Notice that the rhythm (i.e. time between R peaks) is relatively constant, and that all ECG features are clearly noticeable and have the correct locations and magnitudes.&lt;br /&gt;
[[File:Normal ECG Annotated Waveform.png|thumb|center|Relevant features of a normal ECG waveform.]]&lt;br /&gt;
&lt;br /&gt;
==== Abnormal (Atrial Fibrillation) ECG ====&lt;br /&gt;
The waveform below is an example of an ECG waveform in which the patient has a heart condition known as atrial fibrillation (AF). This waveform is abnormal since the R-peak rhythm is inconsistent, the P wave is inconsistent in magnitude, and there are extra waves present. AF is usually characterised by ...&lt;br /&gt;
[[File:AF ECG Annotated Waveform.png|thumb|center|ECG waveform of patient with AF]]&lt;br /&gt;
&lt;br /&gt;
==== Abnormal (Other) ECG ====&lt;br /&gt;
The waveform below is an example of another (unspecified) heart condition. Although the rhythm is consistent, the ECG is missing either the T or P wave, or they overlap.&lt;br /&gt;
[[File:Other ECG Annotated Waveform.png|thumb|center|Other heart abnormality ECG waveform.]]&lt;br /&gt;
&lt;br /&gt;
=== PQRSTdetection Algorithm ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== MATLAB ECG Wavelet Classification ===&lt;br /&gt;
A MathWorks example demonstrates how to classify ECG signals in MATLAB using wavelet-based feature extraction and an SVM classifier&amp;lt;ref&amp;gt;MathWorks, Available: https://au.mathworks.com/help/wavelet/ug/ecg-classification-using-wavelet-features.html &amp;lt;/ref&amp;gt;. The wavelet feature extraction transforms the signals into a smaller set of features, and the SVM then classifies the signals based on the extracted features. The data was split into two sets: a training set and a test set. The training set was used to train the classifier, and the test set was used to measure its accuracy. Each signal belonged to one of three categories (arrhythmia, congestive heart failure, and normal sinus rhythm), and the classifier achieved an accuracy of approximately 98% on the test set. We will use this as a baseline for comparison.&lt;br /&gt;
&lt;br /&gt;
== Conclusion and Future Work ==&lt;br /&gt;
(Conclusions will be added as the project draws to a close).&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;/div&gt;</summary>
		<author><name>A1798520</name></author>
		
	</entry>
</feed>