In python write an outlier detection algorithm using anarray of points. To implement the priority queue, usePython's heapq. The distance betweendata points will be implemented using Euclidean distance (theL2-norm)
#import allowed
import numpy as npimport matplotlib.pyplot as pltimport heapq as hqimport time
def get_outliers(data, k=10, num_outliers=5): """ Input: Numpy array data, interger k is the kthnearest neighbor to consider, integer num_outliers is the number of outliers we arelooking for Action: Implement the slower algorithm of findingdistance-based outliers Output: List of outliers (built as priority queue)with num_outliers entries; entries are (key, value) tuples, key is the Euclidean distance between the outlier andits kth nearest neighbor, value is the index of the outlier point in the inputdata """ # Priority Queue for the outliers # Entries of this queue must be tuples of (key,value) # Keys are Euclidean distances # Values are indices of data points outliers = [] # Enter code here ...
return outliers
Pseudocode for algorithm
example data input :
output: when oputliers = 3
[(euclidian distance, index), (euclidian distance,index), (euclidian distance, index)]
1. init min-priority for x₁ EX: 2. 3. init max-priority queue Q 4. for x; # x₂ EX: 5. 6. 7. WNH 567 8. 9. 10. 11. 12. 13. 14. return O coa queue O insert dist (x₁, xj) into Q if IQ> k remove max from Q if |0|==k and 10|==m and max (0) <min (0) discard x₁ ; not an outlier if xi is not discarded: insert x into O with key max (Q) if |0| > m remove point with min key from O
In python write an outlier detection algorithm using an array of points. To implement the priority queue, use Python's h
-
- Site Admin
- Posts: 899603
- Joined: Mon Aug 02, 2021 8:13 am