# VHDL Simulation of DWT for Low Level Image Processing Application


**Abstract**

The discrete wavelet transform (DWT) is one of
the main approaches used for image compression. This paper
proposes a new DWT architecture that realizes a
memory-efficient 2D DWT unit. The main goal of the
proposed system is to modify the processing unit of the
pipelined DWT so that the total number of processing
operations required by the system is reduced. The hardware
is reduced by exploiting intra-stage parallelism and
inter-stage parallelism. Intra-stage parallelism is obtained
by dividing the 2D filtering operation into four tasks.
Inter-stage parallelism is obtained by mapping the
computational tasks of the multiple decomposition levels
onto the stages of the pipeline. To keep the critical path
delay short, serially concatenated additions are optimized by
changing the computation topology and applying arithmetic
optimization. The proposed architecture computes the DWT
efficiently with fewer clock cycles, and the hardware
complexity of the 2D DWT is significantly reduced.

**Keywords:** discrete wavelet transform, computational
parallelism, inter-stage parallelism, intra-stage parallelism,
multi-resolution filtering.

**I. Introduction**

The multi-resolution decomposition approach is an effective
way to analyze the information content of an image. The 2D
discrete wavelet transform provides this multi-resolution
decomposition capability [1]. The 2D discrete wavelet
transform involves computation on large volumes of data, and
processing them at the various decomposition levels is an
overhead. Many earlier studies have sought to improve the
performance of 2-D DWT computation while using the hardware
resources effectively. The architectures proposed earlier
can be broadly classified into separable [2]–[16] and
non-separable architectures [17]–[27].
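
To make the multi-level data flow concrete, the following is a minimal sketch of a multi-resolution 2-D decomposition. It assumes the Haar wavelet with averaging normalization purely for brevity (the paper does not name a filter here), and the function names `haar_2d_level` and `haar_2d_multilevel` are illustrative, not taken from the paper. Each level quarters the number of samples handed to the next level.

```python
def haar_2d_level(img):
    """One level of a 2-D Haar DWT on a list of lists with even dimensions.

    Assumption: averaging normalization (divide by 4). Subband naming
    conventions vary between texts; here the first letter refers to the
    row-wise (horizontal) filter and the second to the column-wise one.
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]          # top pair of the 2x2 block
            d, e = img[r + 1][c], img[r + 1][c + 1]  # bottom pair
            ll_row.append((a + b + d + e) / 4)  # low-pass in both directions
            lh_row.append((a + b - d - e) / 4)  # low along rows, high along columns
            hl_row.append((a - b + d - e) / 4)  # high along rows, low along columns
            hh_row.append((a - b - d + e) / 4)  # high-pass in both directions
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_2d_multilevel(img, levels):
    """Repeatedly decompose the LL subband; returns (final LL, detail list)."""
    approx, details = img, []
    for _ in range(levels):
        approx, lh, hl, hh = haar_2d_level(approx)
        details.append((lh, hl, hh))
    return approx, details


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
approx, details = haar_2d_multilevel(image, 2)
print(approx)  # coarsest approximation of the 4x4 image
```

In this sketch the first level processes all 16 samples and the second only 4, which is the uneven per-level workload that a multi-level architecture has to schedule.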

A separable architecture is one where a 2-D filtering
operation is divided into two 1-D filtering operations, one for
processing the data row-wise and the other column-wise. A
low-storage short-latency separable architecture where the
row-wise operations are performed by systolic filters and the
column-wise operations are performed by parallel filters has
been proposed in [2]. This architecture requires complex
control units to facilitate the interleaved operations of the
output samples of different decomposition levels by
employing a recursive pyramid algorithm (RPA) [3].
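
The separable idea described above can be sketched in software as two 1-D passes. This is only an illustration under stated assumptions: 2-tap Haar filters are used so that no boundary extension is needed, and the helper names `analyze_1d` and `separable_dwt_2d` are hypothetical. It does not model the systolic or parallel filter hardware of [2].

```python
def analyze_1d(signal, taps):
    """Filter a 1-D sequence and keep every second output (decimation by 2).

    Assumption: no boundary extension, so outputs that would need samples
    past the end of the signal are simply not produced.
    """
    n = len(taps)
    return [sum(taps[k] * signal[i + k] for k in range(n))
            for i in range(0, len(signal) - n + 1, 2)]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def separable_dwt_2d(img, low, high):
    """One decomposition level: row-wise 1-D filtering, then column-wise."""
    # Row-wise pass: every row is filtered independently.
    row_low = [analyze_1d(row, low) for row in img]
    row_high = [analyze_1d(row, high) for row in img]

    # Column-wise pass: transpose, filter the columns as rows, transpose back.
    def filter_columns(matrix, taps):
        return transpose([analyze_1d(col, taps) for col in transpose(matrix)])

    return {
        "LL": filter_columns(row_low, low),
        "LH": filter_columns(row_low, high),   # low along rows, high along columns
        "HL": filter_columns(row_high, low),   # high along rows, low along columns
        "HH": filter_columns(row_high, high),
    }


low_taps, high_taps = [0.5, 0.5], [0.5, -0.5]  # assumed Haar analysis filters
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
subbands = separable_dwt_2d(image, low_taps, high_taps)
print(subbands["LL"])
```

The column pass cannot start until the row pass has produced enough rows, which is why separable hardware designs need intermediate storage between the two directions.
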
A scheme that leads to a low-complexity architecture with
large latency has been proposed by Liao et al. [3], in which
each of the row- and column-wise filtering operations is
decomposed, using the so-called lifting operations [29], into a
cascade of sub-filtering operations. Because non-separable
architectures compute the 2D transform directly with 2D
filters, they do not have this problem. Two non-separable
architectures based on a modified RPA have been
proposed by Chakrabarti et al. [17]. The first uses parallel 2-D
filters, achieving a high degree of computational parallelism
at the expense of less efficient hardware utilization.
The second, a SIMD 2-D architecture, requires the array to be
reconfigured as the processing moves on to
higher decomposition levels. Cheng et al. [18] have proposed
an architecture which improves the processing speed at the
expense of increased hardware by using a number of parallel
FIR filters with a polyphase structure. Hung et al. [19] have
proposed an architecture that is a pipeline of one stage of
parallel multipliers and two stages of accumulators to
perform the accumulation tasks of the filters in each of the
two directions. This architecture reduces the number of
multipliers and facilitates the processing of the boundary
data. Its processing speed is low, however, because the same
architecture is used recursively to perform the tasks of
successive decomposition levels. Marino [21] has proposed a
two-stage pipeline architecture that provides a short
computation time, where the first stage performs the task of
the first decomposition level and the second stage performs
that of all the remaining levels. The hardware complexity
is high and the design is complicated, because the processing
units employed in this architecture differ from one another.
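
As a software illustration of the non-separable approach discussed above, the sketch below applies 2-D kernels directly to the image with a stride of two in both directions. The kernels are built as outer products of assumed 2-tap Haar filters, and the names `outer`, `filter_2d_decimate` and `nonseparable_dwt_2d` are hypothetical. It is not a model of any of the cited architectures.

```python
def outer(col_taps, row_taps):
    """2-D kernel whose entry (i, j) is col_taps[i] * row_taps[j]."""
    return [[c * r for r in row_taps] for c in col_taps]


def filter_2d_decimate(img, kernel):
    """Apply a 2-D kernel directly and keep every second output in both
    directions. Assumption: no boundary extension is performed."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = []
    for r in range(0, len(img) - k_rows + 1, 2):
        out_row = []
        for c in range(0, len(img[0]) - k_cols + 1, 2):
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    acc += kernel[i][j] * img[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


def nonseparable_dwt_2d(img, low, high):
    """One decomposition level computed with four direct 2-D filters."""
    return {
        "LL": filter_2d_decimate(img, outer(low, low)),
        "LH": filter_2d_decimate(img, outer(high, low)),   # low along rows, high along columns
        "HL": filter_2d_decimate(img, outer(low, high)),   # high along rows, low along columns
        "HH": filter_2d_decimate(img, outer(high, high)),
    }


low_taps, high_taps = [0.5, 0.5], [0.5, -0.5]  # assumed Haar analysis filters
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
direct = nonseparable_dwt_2d(image, low_taps, high_taps)
print(direct["LL"])
```

Each output sample depends only on a small window of the input, so no intermediate row or column result has to be stored. The four kernels are also independent of one another and could in principle be evaluated concurrently, although the task partitioning used by the proposed architecture is not specified in this section.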

## References:

- G. Eason, B. Noble, and I. N. Sneddon, “On certain integrals of Lipschitz-Hankel type involving products of Bessel functions,” Phil. Trans. Roy. Soc. London, vol. A247, pp. 529–551, April 1955.
- J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892, pp. 68–73.
- I. S. Jacobs and C. P. Bean, “Fine particles, thin films and exchange anisotropy,” in Magnetism, vol. III, G. T. Rado and H. Suhl, Eds. New York: Academic, 1963, pp. 271–350.
- K. Elissa, “Title of paper if known,” unpublished.
- R. Nicole, “Title of paper with only first word capitalized,” J. Name Stand. Abbrev., in press.
- Y. Yorozu, M. Hirano, K. Oka, and Y. Tagawa, “Electron spectroscopy studies on magneto-optical media and plastic substrate interface,” IEEE Transl. J. Magn. Japan, vol. 2, pp. 740–741, August 1987 [Digests 9th Annual Conf. Magnetics Japan, p. 30, 1982].
- P. K. Meher, B. K. Mohanty, and J. C. Patra, “Hardware-efficient systolic-like modular design for two-dimensional discrete wavelet transform,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 2, pp. 151–155, Feb. 2008.
- A. Benkrid, D. Crookes, and K. Benkrid, “Design and implementation of a generic 2-D orthogonal discrete wavelet transform on an FPGA,” in Proc. IEEE 9th Symp. Field-Programmable Custom Computing Machines (FCCM), Apr. 2001, pp. 190–198.
- P. McCanny, S. Masud, and J. McCanny, “Design and implementation of the symmetrically extended 2-D wavelet transform,” in Proc. IEEE Int. Conf. Acoustic, Speech, Signal Process. (ICASSP), 2002, vol. 3, pp. 3108–3111.
- S. Raghunath and S. M. Aziz, “High speed area efficient multiresolution 2-D 9/7 filter DWT processor,” in Proc. Int. Conf. Very Large Scale Integration (IFIP), Oct. 2006, vol. 16–18, pp. 210–215.
- M. Angelopoulou, K. Masselos, P. Cheung, and Y. Andreopoulos, “A comparison of 2-D discrete wavelet transform computation schedules on FPGAs,” in Proc. IEEE Int. Conf. Field Programmable Technology (FPT), Bangkok, Thailand, Dec. 2006, pp. 181–188.
- C. Chrysafis and A. Ortega, “Line-based, reduced memory, wavelet image compression,” IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 3, pp. 378–389, Mar. 2000.
- M. Ravasi, L. Tenze, and M. Mattavelli, “A scalable and programmable architecture for 2-D DWT decoding,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 8, pp. 671–677, Aug. 2002.
- K. G. Oweiss, A. Mason, Y. Suhail, A. M. Kamboh, and K. E. Thomson, “A scalable wavelet transform VLSI architecture for real-time signal processing in high-density intra-cortical implants,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 6, pp. 1266–1278, Jun. 2007.
- G. Shi, W. Liu, L. Zhang, and F. Li, “An efficient folded architecture for lifting-based discrete wavelet transform,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 4, pp. 290–294, Apr. 2009.
- M. Alam, W. Badawy, V. Dimitrov, and G. Jullien, “An efficient architecture for a lifted 2-D biorthogonal DWT,” J. VLSI Signal Process., vol. 40, pp. 333–342, 2005.
- C. Chakrabarti and M. Vishwanath, “Efficient realizations of the discrete and continuous wavelet transforms: From single chip implementations to mapping on SIMD array computers,” IEEE Trans. Signal Process., vol. 43, no. 3, pp. 759–771, Mar. 1995.
- C. Cheng and K. K. Parhi, “High-speed VLSI implementation of 2-D discrete wavelet transform,” IEEE Trans. Signal Process., vol. 56, no. 1, pp. 393–403, Jan. 2008.
- K. C. Hung, Y. S. Hung, and Y. J. Huang, “A nonseparable VLSI architecture for two-dimensional discrete periodized wavelet transform,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 9, no. 5, pp. 565–576, Oct. 2001.
- I. S. Uzun and A. Amira, “Rapid prototyping framework for FPGA-based discrete biorthogonal wavelet transforms implementation,” IEE Vision, Image Signal Process., vol. 153, no. 6, pp. 721–734, Dec. 2006.
- F. Marino, “Efficient high-speed low-power pipelined architecture for the direct 2-D discrete wavelet transform,” IEEE Trans. Circuits Syst II, Analog. Digit. Signal Process., vol. 47, no. 12, pp. 1476–1491, Dec. 2000.
- R. J. C. Palero, R. G. Gironez, and A. S. Cortes, “A novel FPGA architecture of a 2-D wavelet transform,” J. VLSI Signal Process., vol. 42, pp. 273–284, 2006.
- Q. Dai, X. Chen, and C. Lin, “A novel VLSI architecture for multidimensional discrete wavelet transform,” IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 8, pp. 1105–1110, Aug. 2004.
- M. H. Sheu, M. D. Shieh, and S. W. Liu, “A low cost VLSI architecture design for nonseparable 2-D discrete wavelet transform,” in Proc. 40th Midwest Symp. Circuits Syst., 1997, vol. 2, pp. 1217–1220.
- C. Y. Chen, Z. L. Yang, T. C. Wang, and L. G. Chen, “A programmable VLSI architecture for 2-D discrete wavelet transform,” in Proc. IEEE Int. Symp. Circuits Systems (ISCAS), Geneva, Switzerland, May 28–31, 2000, vol. 1, pp. 619–622.
- B. K. Mohanty and P. K. Meher, “Bit-serial systolic architecture for 2-D non-separable discrete wavelet transform,” in Proc. Int. Conf. Intell. Adv. Syst. (ICIAS), Kuala Lumpur, Malaysia, Nov. 2007.
- C. Yu and S.-J. Chen, “VLSI implementation of 2-D discrete wavelet transform for real-time video signal processing,” IEEE Trans. Consum. Electron., vol. 43, no. 4, Nov. 1997.
- M. Vishwanath, “The recursive pyramid algorithm for the discrete wavelet transforms,” IEEE Trans. Signal Process., vol. 42, no. 3, pp. 673–677, 1994.
- K. A. Kotteri, S. Barua, A. E. Bell, and J. E. Carletta, “A comparison of hardware implementations of the biorthogonal 9/7 DWT: Convolution versus lifting,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 52, no. 5, pp. 256–260, May 2006.
- M. Ferretti and D. Rizzo, “Handling borders in systolic architectures for the 1-D discrete wavelet transform for perfect reconstruction,” IEEE Trans. Signal Process., vol. 48, no. 5, pp. 1365–1378, May 2000.
- D. Guevorkian, P. Liuha, A. Launiainen, and V. Lappalainen, “Architectures for Discrete Wavelet Transforms,” U.S. 6976046, Dec. 13, 2005.
- J. Song and I. Park, “Pipelined discrete wavelet transform architecture scanning dual lines,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 12, pp. 916–920, Dec. 2009.
- C. Zhang, C. Wang, and M. O. Ahmad, “A VLSI architecture for a fast computation of the 2-D discrete wavelet transform,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2007.
- C. Zhang, C. Wang, and M. O. Ahmad, “An efficient buffer-based architecture for on-line computation of 1-D discrete wavelet transform,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP), May 2004, vol. 5, pp. 201–204.
- C. Zhang, C. Wang, and M. O. Ahmad, “A VLSI architecture for a high-speed computation of the 1-D discrete wavelet transform,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2005, vol. 2, pp. 1461–1464.
- C. Zhang, C. Wang, and M. O. Ahmad, “A pipeline VLSI architecture for high-speed computation of the 1-D discrete wavelet transform,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 10, pp. 2729–2740, Oct. 2010.