이슬나라 [isulnara.com]
Peak Detection
Posted on 2009/05/05 18:19
 
 
 
 
    Introduction
Musical instrument signals generally consist of a transient portion and steady state or quasi-periodic portion. The transient part is usually the attack of the signal and the steady state the portion that follows the attack part. When investigating time variant signals it is critical to make use of both time and frequency domain analysis techniques. Some important features in musical signals include duration, amplitude modulation, pitch, spectral harmonicity, spectral envelope, spectral centroid and the like. Attack time is especially considered a salient feature of musical timbre (Eagleson and Eagleson 1947; Saldanha and Corso 1964; Elliot 1975) and has been thought to be a dominant feature of musical instruments. However, it has also been discovered that the attack time and also note-to-note transients of a signal are neither sufficient nor necessary for recognizing musical instruments (Kendall 1986). This controversial discovery supports the importance of the steady state portion of a signal.

 
 
 

This chapter mainly describes the implementation of the signal processing algorithms used in the software system for extracting features that depict these transient and stationary characteristics in the frequency and time domains. The frequency domain analysis section of this chapter is primarily based on the discrete Fourier transform (DFT). The DFT-based spectral analysis algorithms discussed include the short time Fourier transform, spectral centroid, spectral smoothness and tracking of partials over time. In the time domain analysis section I will mainly describe the implementation of algorithms including pitch detection with interpolation and period averaging based on the autocorrelation function. Other modules discussed are amplitude envelope, amplitude modulation, attack time computation and noise content analysis.
 
 
 

      Frequency Domain Analysis
        DFT and STFT
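The spectral analysis part of feature extraction is primarily based on the discrete Fourier transform (DFT). The continuous-time and discrete-time versions of the Fourier transform are shown below in their standard forms.

X(f) = \int_{-\infty}^{\infty} x(t) \, e^{-j 2 \pi f t} \, dt    (2.1)

X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-j 2 \pi k n / N}, \quad k = 0, 1, \ldots, N-1    (2.2)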

To extract transitory spectral characteristics the short time Fourier transform (STFT) was used (Allen 1977; Allen and Rabiner 1977). The basic algorithm is as follows.

(2.3)

As seen in figure 2.1, the STFT can be simply described as windowing and taking the FFT of the signal. There are various window types available in the program with different side-lobe and main-lobe characteristics. The Hamming window has been shown to work particularly well with musical signals (De Poli, Piccialli and Roads 1991). See the appendix for details regarding windowing and its side-lobe and main-lobe characteristics.

Figure 2.1 Short time Fourier transform and Spectral Peak Detection
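
The "window, then transform" description of the STFT can be sketched in a few lines of Python. This is only an illustration, not the code of the software described here: the function names, the frame length and the hop size are assumptions, and a naive O(N^2) DFT stands in for the FFT so that the example needs nothing beyond the standard library.

```python
import cmath
import math


def hamming(n_points):
    """Symmetric Hamming window of length n_points."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def dft_magnitude(frame):
    """Naive DFT of one frame; returns magnitudes for bins DC..Nyquist."""
    n_points = len(frame)
    mags = []
    for k in range(n_points // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_points)
                  for n, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def stft_magnitude(signal, frame_len, hop):
    """Slide a Hamming window over the signal and transform each frame.

    frame_len and hop are illustrative parameters (assumed, in samples).
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        frames.append(dft_magnitude([s * w for s, w in zip(chunk, window)]))
    return frames
```

For a sine wave that completes exactly 8 cycles per 64-sample frame, every frame returned by stft_magnitude(signal, 64, 32) has its largest magnitude in bin 8, with the neighboring bins raised by the main lobe of the window.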
 
 
 

        Spectral Peak Detection and Tracking
Pitched musical instruments display a high degree of harmonic spectral quality when analyzed for frequency content. Most tend to have quasi-integer harmonic relationships between spectral peaks and the fundamental frequency. In voice, the spectral envelope displays mountain-like contours or valleys known as formants. The locations of the formants distinctively describe vowels. This is also evident in violins, but the number of valleys is greater and the formant locations change very little with time unlike the voice, which varies substantially for each vowel. Woodwinds such as the bassoon and oboe on the other hand have fewer formants than the voice, but tend to have stronger and clearer spectral contours that perceptually characterize the woodwind family (Cook 1999). Generally, musical instruments like the plucked string (figure 2.2) exhibit lower energy in the high frequency bins. The higher partials normally have less energy and also die out faster than lower ones over time.
Figure 2.2 Plucked string spectrum

Using the short time Fourier transform, I have implemented a spectral peak detection and tracking method, extracting quasi-integer related harmonics from the spectrum. The peak picking algorithm takes into consideration magnitude and frequency information to select the most prominent and harmonically behaving peaks. To help in the search for spectral peaks, various threshold values are used as described below.
 
 
 

The spectral peak detection algorithm is divided into four main steps. The first pass roughly locates possible peaks, where the roughness factor for searching peaks is controlled via a threshold value. The threshold value basically dictates the degree of "peakiness" that is allowed for a local maximum to be considered a possible peak. The second pass filters out peaks that may have been erroneously selected in step 1. The third pass looks for any broken harmonic sequence, analyzing harmonic relationships of the currently selected peaks. In this pass, peaks that may have been deleted or missed in the previous two passes are inserted. The final pass looks at the selected peaks and further does a harmonic analysis ultimately leaving a set of peaks that are most probably harmonics. A mean and scalable standard deviation error method is applied for control of inharmonicity.
 

Figure 2.3 Peak detection algorithm
          Step 1: Rough Peak Detection
Figure 2.4 Rough search for peaks

In the rough peak detection algorithm, possible peaks are picked using negative and positive slope threshold values to guide the selection process. As shown in figure 2.4, the polarity of the slope of the spectrum is computed from bin to bin (DC to Nyquist) using the basic assumption that a transition from positive to negative slope calls for the possibility of a peak. The following conditions help in the selection of a peak:

 
 
 
    The slope must change polarity, positive to negative.
    The magnitude difference between the peak candidate and the current bin's magnitude component (X[k]-X[k+4]) must be greater than a threshold value - see example (figure 2.5).
    A new peak candidate search occurs only after there is a slope change from negative to positive and when a threshold value as shown in figure 2.6 is exceeded.
Refer to flowcharts in the appendix for details.
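
The three conditions above can be read as a small two-state scan over the magnitude spectrum. The sketch below is one possible interpretation, not the flowchart from the appendix: the names rough_peaks, drop_threshold and rise_threshold are hypothetical, and the thresholds are expressed in the same units as the magnitudes.

```python
def rough_peaks(mags, drop_threshold, rise_threshold):
    """Return bin indices of rough peak candidates, scanning DC to Nyquist.

    A candidate is accepted once the spectrum has fallen more than
    drop_threshold below it (positive-to-negative slope change).
    A new candidate search starts only after the spectrum has risen more
    than rise_threshold above the lowest bin seen since the last peak
    (negative-to-positive slope change), which skips small noise bumps.
    """
    peaks = []
    if not mags:
        return peaks
    candidate = 0            # index of the current peak candidate
    valley = 0               # lowest bin since the last accepted peak
    searching_peak = False   # start by waiting for a clear rise
    for k in range(1, len(mags)):
        if searching_peak:
            if mags[k] > mags[candidate]:
                candidate = k
            elif mags[candidate] - mags[k] > drop_threshold:
                peaks.append(candidate)
                valley = k
                searching_peak = False
        else:
            if mags[k] < mags[valley]:
                valley = k
            elif mags[k] - mags[valley] > rise_threshold:
                candidate = k
                searching_peak = True
    return peaks
```

With mags = [0, 1, 5, 1, 1.2, 1, 4, 0.5, 0] and both thresholds set to 0.5, the scan keeps bins 2 and 6 and ignores the small bump at bin 4, which never rises far enough above the preceding valley.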

 
Figure 2.5 Actual peak assessment
Figure 2.6 Transitional peaks (noise)

 
          Step 2: Prominent Peak Search
In step 2, prominent peaks are located from a set of potential peaks found in step 1. The purpose is to filter out local peaks which may be present between stronger partial candidates as shown in figure 2.7. The search for prominent peaks is done in the following way:

 
Figure 2.7 Prominent peak search
    The bin with the maximum magnitude is found.
    Relative to position of the peak with maximum amplitude, peaks are analyzed moving towards DC.
    Relative to position of peak with maximum amplitude, peaks are analyzed moving towards the Nyquist frequency.
Local maxima or peaks are picked out using an adaptive threshold value that reflects the relationship between a prominent peak (possible partial) and its neighboring peaks, as shown in figure 2.7. For example, a 50% threshold value requires neighboring peaks to have at least half the magnitude of the prominent peak (possible partial). Refer to the appendix for details on the algorithm.
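
A minimal sketch of this outward search follows, under the assumption that "adaptive" means each kept peak becomes the reference for the next comparison in the same direction. The name prominent_peaks and the parameter ratio are hypothetical; the appendix algorithm may differ in detail.

```python
def prominent_peaks(mags, candidates, ratio=0.5):
    """Filter rough peak candidates, keeping only prominent ones.

    mags:       magnitude spectrum indexed by bin
    candidates: bin indices from the rough search (step 1)
    ratio:      relative threshold, e.g. 0.5 for the 50% example
    """
    if not candidates:
        return []
    order = sorted(candidates)
    # 1. Find the candidate with the maximum magnitude.
    top = max(range(len(order)), key=lambda i: mags[order[i]])
    kept = [order[top]]

    # 2. Analyze peaks moving towards DC.
    reference = mags[order[top]]
    for i in range(top - 1, -1, -1):
        if mags[order[i]] >= ratio * reference:
            kept.append(order[i])
            reference = mags[order[i]]  # adapt to the newly kept peak

    # 3. Analyze peaks moving towards the Nyquist frequency.
    reference = mags[order[top]]
    for i in range(top + 1, len(order)):
        if mags[order[i]] >= ratio * reference:
            kept.append(order[i])
            reference = mags[order[i]]

    return sorted(kept)
```

For example, with candidate magnitudes 8, 1, 10, 2, 6 and 4 at bins 2, 4, 6, 8, 10 and 12, a 50% ratio keeps bins 2, 6, 10 and 12 and drops the weak local peaks at bins 4 and 8.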

 
 
 
          Step 3: Harmonic Break Search
The third step is called the harmonic break search. Here, I have tried to analyze if some "potential partials" were deleted or missed in the previous steps. This may occur when potentially harmonically related peaks temporarily have little energy or are simply much weaker than the stronger ones, but are nevertheless harmonic. The harmonic break search is divided into the following sub-routines:
    Analyze the harmonic relationship between the current partial candidates by computing the mean bin spacing between all prominent peaks.
    (2.4)

    Detect any harmonic breaks, or discontinuities, between prominent peaks.

    If discontinuities are found, go back to steps 1 and 2 and do a refined search for possible peaks between pairs of prominent peaks.

Figure 2.8 Harmonic break search

In the second sub-step of the harmonic break search, harmonic discontinuities are detected using a pair of threshold values that limit the range of harmonic deviation. Hence, the algorithm expects the possibility of a peak within the threshold bounds computed in sub-step 2 (figure 2.8). Refer to the appendix for more details on the algorithm.
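
The first two sub-routines can be illustrated as follows. Because equation 2.4 is not reproduced here, the sketch assumes the mean bin spacing is the plain arithmetic mean of the distances between consecutive prominent peaks. The names harmonic_breaks, break_factor and tolerance are hypothetical, and the returned window is only one way to express the pair of threshold bounds.

```python
def harmonic_breaks(peaks, break_factor=1.5, tolerance=0.25):
    """Locate gaps in a sorted list of prominent peak bins.

    Returns (mean_spacing, windows), where each window is a (low, high)
    bin range in which a missing partial would be expected.

    break_factor: a spacing larger than break_factor * mean_spacing
                  counts as a harmonic break (assumed criterion)
    tolerance:    relative half-width of the search window around the
                  expected position (assumed threshold pair)
    """
    if len(peaks) < 2:
        return 0.0, []
    pairs = list(zip(peaks, peaks[1:]))
    spacings = [right - left for left, right in pairs]
    mean_spacing = sum(spacings) / len(spacings)

    windows = []
    for left, right in pairs:
        if right - left > break_factor * mean_spacing:
            low = left + (1 - tolerance) * mean_spacing
            high = min(left + (1 + tolerance) * mean_spacing, right)
            windows.append((low, high))
    return mean_spacing, windows
```

For peaks at bins 10, 20, 30, 50 and 60 the mean spacing is 12.5, the gap between bins 30 and 50 is flagged, and the refined search of steps 1 and 2 would be restricted to the window from bin 39.375 to bin 45.625. Note that a plain mean is pulled upwards by the gap itself, which is one reason the bounds need to be tunable.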
 
 
 

          Step 4: Harmonicity Analysis
Finally, in step 4 an overall harmonicity verification is performed. In this last step, the first few peaks (selectable in software) are used as a guide to determine the final set of partials. The first few peaks of the spectrum are chosen because, in highly pitch-salient signals, lower harmonics are usually stronger and more stable than higher ones.
The idea is to use the Gaussian normal distribution function, employing mean, variance and standard deviation, to eliminate inharmonic or misbehaving partials. A peak that lies outside a right and left threshold bound is considered inharmonic and misbehaving. A mean bin spacing value denoting the bin distances between neighboring peak candidates is computed to render the variance and standard deviation. As the lower partials generally tend to be more stable and have more energy, the first K (K: integer > 0) peaks are used for the computation of the standard deviation. A scaled version of the standard deviation is then used as a criterion for evaluating the inharmonicity of each partial candidate. The scaled standard deviation is increased or decreased to control the permitted spread of each peak. In other words, the scaled standard deviation is directly relevant to the amount of inharmonicity tolerated when selecting the final set of peaks. The scalar that controls the scaled standard deviation is a value between 0 and 1, where 1 is equivalent to limiting the peaks to the original unscaled standard deviation. This method is implemented by computing an ideal sequence of harmonics using the data acquired above. Hence the ideal harmonic series is a sequence of partials as shown below.
(2.5)

The ideal set of harmonics and the actual set of harmonics are compared and the error (equation 2.6) for each peak is computed and verified against the scaled standard deviation for final assessment. Peaks that have excessive error values are deleted from the final set of peaks and the remaining ones are finally considered harmonics. See the appendix for more details on algorithm.

(2.6)

Equation 2.6 shows the error between the ideal and actual bins where M is the number of ideal peaks and N is the number of actual peaks in the spectrum. M and N have different values as missing partials may exist in the actual set of peaks.
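
The verification can be sketched as below. Equations 2.5 and 2.6 are not reproduced here, so the sketch makes two labeled assumptions: the ideal harmonic series is taken to be integer multiples of the mean bin spacing, and the error of a peak is its absolute distance to the nearest ideal harmonic. The names harmonicity_filter, k_first and scale are hypothetical.

```python
import math


def harmonicity_filter(peaks, k_first=4, scale=1.0):
    """Keep only peaks that lie close to an ideal harmonic series.

    peaks:   sorted peak positions in bins
    k_first: number of low peaks used as a guide (the K of the text)
    scale:   scalar in (0, 1] applied to the standard deviation
    """
    guide = peaks[:k_first]
    if len(guide) < 2:
        return list(peaks)

    # Mean, variance and standard deviation of the guide spacings.
    spacings = [b - a for a, b in zip(guide, guide[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    variance = sum((s - mean_spacing) ** 2 for s in spacings) / len(spacings)
    bound = scale * math.sqrt(variance)  # scaled standard deviation

    kept = []
    for peak in peaks:
        # Assumed ideal series: harmonic h sits at h * mean_spacing.
        harmonic = max(1, round(peak / mean_spacing))
        error = abs(peak - harmonic * mean_spacing)
        if error <= bound:
            kept.append(peak)
    return kept
```

With peaks at bins 100, 201, 299, 400, 540 and 600, the first four peaks give a mean spacing of 100 and a standard deviation of about 1.41. A scale of 1.0 removes only the peak at bin 540, while a scale of 0.5 tightens the bound to about 0.71 and also removes bins 201 and 299.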
 
 
 

          Partial Tracking between Frames
Once harmonics have been evaluated in each frame (a frame is equal to the length of the FFT), they are combined to render a spectrogram. Frame to frame partial movement is determined using a harmonic continuity criterion as shown in figure 2.9.
 
Figure 2.9 Partial tracking between frames
The harmonic continuity criterion is explained as follows: Each harmonic in a frame is allowed to sway in frequency within a set of error margin values. Hence, as shown in figure 2.9, four of the harmonics make a continuous harmonic path (k, k+1, k+2, k+3). However, the harmonic in frame k+4 exceeds the allowed error margin and breaks the previous harmonic path. At frame k+4 a new path is created and the path which started at frame k is discontinued. The harmonic continuity criterion is helpful in observing movements of the harmonics over time and frequency.
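
The continuity criterion can be sketched as a greedy frame-to-frame matcher. This is an illustration under stated assumptions rather than the implementation used in the software: the name track_partials, the dictionary layout of a path and the single fixed margin (in the same units as the peak positions) are all hypothetical.

```python
def track_partials(frames, margin):
    """Link peaks of consecutive frames into harmonic paths.

    frames: list of frames, each a list of peak positions
    margin: maximum allowed movement of a partial between two frames

    Returns a list of paths; each path is a dict with the index of its
    first frame ("start") and its peak positions ("freqs").
    """
    finished = []
    active = []
    for t, peaks in enumerate(frames):
        unused = list(peaks)
        still_active = []
        for path in active:
            last = path["freqs"][-1]
            # Closest unmatched peak in the current frame, if any.
            best = min(unused, key=lambda f: abs(f - last), default=None)
            if best is not None and abs(best - last) <= margin:
                path["freqs"].append(best)
                unused.remove(best)
                still_active.append(path)
            else:
                finished.append(path)  # margin exceeded: path ends here
        # Every unmatched peak starts a new path at this frame.
        for f in unused:
            still_active.append({"start": t, "freqs": [f]})
        active = still_active
    return finished + active
```

A single partial observed at 100, 101, 99.5, 100.5 and 110 with a margin of 3 reproduces the situation of figure 2.9: the first four frames form one path, and the jump in the fifth frame ends it and starts a new path.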


Source: http://silvertone.princeton.edu/~park/thesis/dartmouth/html/ch2-1.html
Creative Commons License
This work may be used under the Creative Commons Korea Attribution-NonCommercial-NoDerivs 2.0 Korea license.