VHDL Simulation of DWT for Low Level Image Processing Application

IJCSEC Front Page

Abstract
The discrete wavelet transform (DWT) is one of the main approaches used for image compression. This paper proposes a new DWT architecture that realizes a memory-efficient 2-D DWT unit. The main goal of the proposed system is to modify the processing unit of the pipelined DWT so that the total number of processing operations required by the system is reduced. The hardware is reduced by exploiting intra-stage and inter-stage parallelism. Intra-stage parallelism is obtained by dividing the 2-D filtering operation into four tasks. Inter-stage parallelism is obtained by mapping the computational tasks of the multiple decomposition levels onto the stages of the pipeline. To limit the critical path delay, serially concatenated additions are optimized by changing the computation topology and applying arithmetic optimization. The proposed architecture computes the DWT in fewer clock cycles, and the hardware complexity of the 2-D DWT is significantly reduced.
Keywords: discrete wavelet transform, computational parallelism, inter-stage parallelism, intra-stage parallelism, multiresolution filtering.
I. Introduction
Multiresolution decomposition is an effective approach for analyzing the information content of an image. The 2-D discrete wavelet transform (DWT) provides this multiresolution decomposition capability [1]. The 2-D DWT involves computation on large volumes of data, and processing them across several decomposition levels adds overhead. Many earlier studies have sought to improve the performance of 2-D DWT computation while utilizing hardware resources effectively. The architectures proposed earlier can be broadly classified into separable [2]–[16] and non-separable [17]–[27] architectures.
A separable architecture is one in which a 2-D filtering operation is divided into two 1-D filtering operations, one processing the data row-wise and the other column-wise. A low-storage, short-latency separable architecture, in which the row-wise operations are performed by systolic filters and the column-wise operations by parallel filters, has been proposed in [2]. This architecture requires complex control units to interleave the output samples of different decomposition levels, as it employs a recursive pyramid algorithm (RPA) [3]. Liao et al. [3] have proposed a scheme that leads to a low-complexity architecture with large latency, in which each of the row- and column-wise filtering operations is decomposed, using the so-called lifting operations [29], into a cascade of sub-filtering operations. Non-separable architectures do not have this problem, since they compute the 2-D transform directly with 2-D filters. Chakrabarti et al. [17] have proposed two non-separable architectures based on a modified RPA. The first uses parallel 2-D filters and achieves a high degree of computational parallelism at the expense of less efficient hardware utilization. The second, an SIMD 2-D architecture, requires the array to be reconfigured as the processing moves on to higher decomposition levels. Cheng et al. [18] have proposed an architecture that improves the processing speed at the expense of increased hardware by using a number of parallel FIR filters with a polyphase structure. Hung et al. [19] have proposed an architecture that is a pipeline of one stage of parallel multipliers and two stages of accumulators, which perform the accumulation tasks of the filters in each of the two directions. This architecture provides a reduced multiplier count and facilitates the processing of the boundary data.
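The distinction between the separable and non-separable classes surveyed in this paragraph can be illustrated with a small software sketch. It is not a model of any of the cited architectures: the two-tap Haar filters, the subband naming, and the function names `separable_dwt2` and `direct_dwt2` are assumptions chosen only to keep the example short. The first function applies a row-wise 1-D pass followed by a column-wise 1-D pass; the second computes the same subbands directly with 2×2 kernels.

```python
import math

# Assumption: two-tap Haar analysis filters, chosen only to keep the sketch short.
S = 1.0 / math.sqrt(2.0)
LOW = (S, S)     # low-pass taps
HIGH = (S, -S)   # high-pass taps


def filter_and_decimate(samples, taps):
    """Two-tap 1-D filtering followed by decimation by two (even length assumed)."""
    return [taps[0] * samples[i] + taps[1] * samples[i + 1]
            for i in range(0, len(samples) - 1, 2)]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def filter_columns(matrix, taps):
    """Apply the 1-D filter along every column of the matrix."""
    return transpose([filter_and_decimate(col, taps) for col in transpose(matrix)])


def separable_dwt2(image):
    """One level: a row-wise 1-D pass followed by a column-wise 1-D pass.

    Subband names give the row-wise filter first, then the column-wise filter.
    """
    row_low = [filter_and_decimate(row, LOW) for row in image]
    row_high = [filter_and_decimate(row, HIGH) for row in image]
    return {
        "LL": filter_columns(row_low, LOW),
        "LH": filter_columns(row_low, HIGH),
        "HL": filter_columns(row_high, LOW),
        "HH": filter_columns(row_high, HIGH),
    }


def direct_dwt2(image):
    """The same level computed directly with 2x2 kernels, with no intermediate pass."""
    pairs = {"LL": (LOW, LOW), "LH": (LOW, HIGH),
             "HL": (HIGH, LOW), "HH": (HIGH, HIGH)}
    rows, cols = len(image), len(image[0])
    result = {}
    for name, (row_taps, col_taps) in pairs.items():
        # Each 2x2 kernel is the outer product of the column and row taps.
        result[name] = [
            [sum(col_taps[m] * row_taps[n] * image[i + m][j + n]
                 for m in range(2) for n in range(2))
             for j in range(0, cols - 1, 2)]
            for i in range(0, rows - 1, 2)
        ]
    return result
```

For an image with even dimensions, both functions return four subbands of half the size in each direction, and their outputs agree up to floating-point rounding. The separable form needs an intermediate row-filtered result before the column pass can start, whereas the direct form reads each 2×2 input block once.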
The processing speed of the architecture of Hung et al. [19] is low, however, because the same architecture is used recursively to perform the tasks of successive decomposition levels. Marino [21] has proposed a two-stage pipeline architecture that provides a short computation time: the first stage performs the task of the first decomposition level, and the second stage performs the tasks of all the remaining levels. The hardware complexity is high and the design is complicated, however, because the processing units employed in this architecture differ from one another.
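A short calculation shows how the workload is distributed over the decomposition levels, which is relevant to any scheme that assigns level 1 to one pipeline stage and the remaining levels to another. The sketch below assumes a square image whose side is a power of two and dyadic decimation in both directions, so that each level receives one quarter of the input samples of the previous level. The function names are hypothetical, and the calculation is not taken from the cited papers.

```python
from fractions import Fraction


def level_input_samples(size, levels):
    """Input samples processed at each decomposition level of a size x size image."""
    # Assumption: size is a power of two, so each level halves both dimensions.
    return [(size >> (level - 1)) ** 2 for level in range(1, levels + 1)]


def remaining_to_first_ratio(size, levels):
    """Workload of levels 2..J relative to level 1, as an exact fraction."""
    counts = level_input_samples(size, levels)
    return Fraction(sum(counts[1:]), counts[0])
```

For a 512×512 image and three levels, `level_input_samples(512, 3)` gives 262144, 65536 and 16384 samples, and `remaining_to_first_ratio(512, 3)` is 5/16. Since the ratio is a partial sum of the geometric series 1/4 + 1/16 + ..., it stays below 1/3 for any number of levels.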

References:

  1. G. Eason, B. Noble, and I. N. Sneddon, “On certain integrals of Lipschitz-Hankel type involving products of Bessel functions,” Phil. Trans. Roy. Soc. London, vol. A247, pp. 529–551, April 1955.
  2. J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892, pp. 68–73.
  3. I. S. Jacobs and C. P. Bean, “Fine particles, thin films and exchange anisotropy,” in Magnetism, vol. III, G. T. Rado and H. Suhl, Eds. New York: Academic, 1963, pp. 271–350.
  4. K. Elissa, “Title of paper if known,” unpublished.
  5. R. Nicole, “Title of paper with only first word capitalized,” J. Name Stand. Abbrev., in press.
  6. Y. Yorozu, M. Hirano, K. Oka, and Y. Tagawa, “Electron spectroscopy studies on magneto-optical media and plastic substrate interface,” IEEE Transl. J. Magn. Japan, vol. 2, pp. 740–741, August 1987 [Digests 9th Annual Conf. Magnetics Japan, p. 30, 1982].
  7. P. K. Meher, B. K. Mohanty, and J. C. Patra, “Hardware-efficient systolic-like modular design for two-dimensional discrete wavelet transform,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 2, pp. 151–155, Feb. 2008.
  8. A. Benkrid, D. Crookes, and K. Benkrid, “Design and implementation of a generic 2-D orthogonal discrete wavelet transform on an FPGA,” in Proc. IEEE 9th Symp. Field-Programmable Custom Computing Machines (FCCM), Apr. 2001, pp. 190–198.
  9. P. McCanny, S. Masud, and J. McCanny, “Design and implementation of the symmetrically extended 2-D wavelet transform,” in Proc. IEEE Int. Conf. Acoustic, Speech, Signal Process. (ICASSP), 2002, vol. 3, pp. 3108–3111.
  10. S. Raghunath and S. M. Aziz, “High speed area efficient multiresolution 2-D 9/7 filter DWT processor,” in Proc. Int. Conf. Very Large Scale Integration (IFIP), Oct. 2006, vol. 16–18, pp. 210–215.
  11. M. Angelopoulou, K. Masselos, P. Cheung, and Y. Andreopoulos, “A comparison of 2-D discrete wavelet transform computation schedules on FPGAs,” in Proc. IEEE Int. Conf. Field Programmable Technology (FPT), Bangkok, Thailand, Dec. 2006, pp. 181–188.
  12. C. Chrysafis and A. Ortega, “Line-based, reduced memory, wavelet image compression,” IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 3, pp. 378–389, Mar. 2000.
  13. M. Ravasi, L. Tenze, and M. Mattavelli, “A scalable and programmable architecture for 2-D DWT decoding,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 8, pp. 671–677, Aug. 2002.
  14. K. G. Oweiss, A. Mason, Y. Suhail, A. M. Kamboh, and K. E. Thomson, “A scalable wavelet transform VLSI architecture for real-time signal processing in high-density intra-cortical implants,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 6, pp. 1266–1278, Jun. 2007.
  15. G. Shi, W. Liu, L. Zhang, and F. Li, “An efficient folded architecture for lifting-based discrete wavelet transform,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 4, pp. 290–294, Apr. 2009.
  16. M. Alam, W. Badawy, V. Dimitrov, and G. Jullien, “An efficient architecture for a lifted 2-D biorthogonal DWT,” J. VLSI Signal Process., vol. 40, pp. 333–342, 2005.
  17. C. Chakrabarti and M. Vishwanath, “Efficient realizations of the discrete and continuous wavelet transforms: From single chip implementations to mapping on SIMD array computers,” IEEE Trans. Signal Process., vol. 43, no. 3, pp. 759–771, Mar. 1995.
  18. C. Cheng and K. K. Parhi, “High-speed VLSI implementation of 2-D discrete wavelet transform,” IEEE Trans. Signal Process., vol. 56, no. 1, pp. 393–403, Jan. 2008.
  19. K. C. Hung, Y. S. Hung, and Y. J. Huang, “A nonseparable VLSI architecture for two-dimensional discrete periodized wavelet transform,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 9, no. 5, pp. 565–576, Oct. 2001.
  20. I. S. Uzun and A. Amira, “Rapid prototyping—framework for FPGA-based discrete biorthogonal wavelet transforms implementation,” IEE Vision, Image Signal Process., vol. 153, no. 6, pp. 721–734, Dec. 2006.
  21. F. Marino, “Efficient high-speed low-power pipelined architecture for the direct 2-D discrete wavelet transform,” IEEE Trans. Circuits Syst. II, Analog. Digit. Signal Process., vol. 47, no. 12, pp. 1476–1491, Dec. 2000.
  22. R. J. C. Palero, R. G. Gironez, and A. S. Cortes, “A novel FPGA architecture of a 2-D wavelet transform,” J. VLSI Signal Process., vol. 42, pp. 273–284, 2006.
  23. Q. Dai, X. Chen, and C. Lin, “A novel VLSI architecture for multidimensional discrete wavelet transform,” IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 8, pp. 1105–1110, Aug. 2004.
  24. M. H. Sheu, M. D. Shieh, and S. W. Liu, “A low cost VLSI architecture design for nonseparable 2-D discrete wavelet transform,” in Proc. 40th Midwest Symp. Circuits Syst., 1997, vol. 2, pp. 1217–1220.
  25. C. Y. Chen, Z. L. Yang, T. C. Wang, and L. G. Chen, “A programmable VLSI architecture for 2-D discrete wavelet transform,” in Proc. IEEE Int. Symp. Circuits Systems (ISCAS), Geneva, Switzerland, May 28–31, 2000, vol. 1, pp. 619–622.
  26. B. K. Mohanty and P. K. Meher, “Bit-serial systolic architecture for 2-D non-separable discrete wavelet transform,” in Proc. Int. Conf. Intell. Adv. Syst. (ICIAS), Kuala Lumpur, Malaysia, Nov. 2007.
  27. C. Yu and S.-J. Chen, “VLSI implementation of 2-D discrete wavelet transform for real-time video signal processing,” IEEE Trans. Consum. Electron., vol. 43, no. 4, Nov. 1997.
  28. M. Vishwanath, “The recursive pyramid algorithm for the discrete wavelet transforms,” IEEE Trans. Signal Process., vol. 42, no. 3, pp. 673–677, 1994.
  29. K. A. Kotteri, S. Barua, A. E. Bell, and J. E. Carletta, “A comparison of hardware implementations of the biorthogonal 9/7 DWT: Convolution versus lifting,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 52, no. 5, pp. 256–260, May 2006.
  30. M. Ferretti and D. Rizzo, “Handling borders in systolic architectures for the 1-D discrete wavelet transform for perfect reconstruction,” IEEE Trans. Signal Process., vol. 48, no. 5, pp. 1365–1378, May 2000.
  31. D. Guevorkian, P. Liuha, A. Launiainen, and V. Lappalainen, “Architectures for Discrete Wavelet Transforms,” U.S. 6976046, Dec. 13, 2005.
  32. J. Song and I. Park, “Pipelined discrete wavelet transform architecture scanning dual lines,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 12, pp. 916–920, Dec. 2009.
  33. C. Zhang, C. Wang, and M. O. Ahmad, “A VLSI architecture for a fast computation of the 2-D discrete wavelet transform,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2007.
  34. C. Zhang, C. Wang, and M. O. Ahmad, “An efficient buffer-based architecture for on-line computation of 1-D discrete wavelet transform,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP), May 2004, vol. 5, pp. 201–204.
  35. C. Zhang, C. Wang, and M. O. Ahmad, “A VLSI architecture for a high-speed computation of the 1-D discrete wavelet transform,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2005, vol. 2, pp. 1461–1464.
  36. C. Zhang, C. Wang, and M. O. Ahmad, “A pipeline VLSI architecture for high-speed computation of the 1-D discrete wavelet transform,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 10, pp. 2729–2740, Oct. 2010.