Content-based image retrieval (CBIR), a technique that uses visual content such as color, texture, and shape to search for images in large-scale databases, is an active research area. In this paper, a de-duplication process for photographs was implemented using CBIR. The CBIR technique uses the color histogram refinement feature. The photograph data was divided into clusters using the k-means clustering algorithm. The number of clusters depends on the number of photographs in each district of the state. The photo de-duplication exercise was carried out on a large database containing approximately 22 million photographs. The experimental results show that approximately 0.35 million photographs were duplicates. © 2011 IEEE.
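
The pipeline described above (color feature extraction, k-means clustering, then duplicate search within each cluster) can be illustrated with a minimal sketch. It is not the authors' implementation: it substitutes a plain quantized RGB histogram for the color histogram refinement feature, and the function names, the bin count, the L1 distance, and the `threshold` value are all assumptions chosen for illustration. Clustering first means that candidate pairs are compared only within a cluster, which is what makes pairwise comparison plausible at the scale of millions of photographs.

```python
import random
from itertools import combinations


def color_histogram(pixels, bins_per_channel=4):
    """Normalized, quantized RGB histogram of (r, g, b) pixels in 0..255.

    Assumption: a plain histogram stands in for histogram refinement.
    """
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    count = 0
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[idx] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def sq_dist(a, b):
    """Squared Euclidean distance, used for k-means assignment."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l1_dist(a, b):
    """L1 distance between histograms (an assumed similarity measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(vectors, k, iters=20, seed=0):
    """Basic Lloyd's k-means; returns one cluster label per vector."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(v, centers[c])) for v in vectors]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def find_duplicates(histograms, k, threshold=0.05):
    """Return index pairs whose histograms are near-identical within a cluster."""
    labels = kmeans(histograms, k)
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    pairs = []
    for members in clusters.values():
        for i, j in combinations(members, 2):
            if l1_dist(histograms[i], histograms[j]) <= threshold:
                pairs.append((i, j))
    return sorted(pairs)


# Toy data: images 0 and 2 are near-identical reds, 1 is blue, 3 is green.
images = [
    [(250, 10, 10)] * 100,
    [(10, 10, 250)] * 100,
    [(248, 12, 9)] * 100,
    [(10, 250, 10)] * 100,
]
histograms = [color_histogram(img) for img in images]
pairs = find_duplicates(histograms, k=2)
print(pairs)
```

In this sketch `k` is passed in directly; in the paper the number of clusters is tied to the number of photographs in each district. Near-duplicates that land in different clusters would be missed here, so a lower `k` trades speed for recall.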