These puzzles involve reading in an image, modifying it in some way, then writing out a new image file.
[H-80]
Write a main program that prints information
about an image file to the terminal. Deal only with two types of image files:
PGM and PPM. For any other type of file write an error message. Echo comment
lines (if any) that are in the image file.
Here are some examples of the program's output:
C:\PuzzleFolder>info gray240.pgm type: PGM # comment line one # comment line two rows: 200 cols: 300 levels: 256 C:\PuzzleFolder>info colorImage.ppm type: PPM rows: 200 cols: 300 levels: 256 C:\PuzzleFolder>info stuff.txt type: unknown
It is the possible presence of comments that makes this a hard puzzle.
To make the puzzle easier, you could start with readPGMimage()
and modify it.
Review the format of PGM image files in the section on image file formats, summarized here:
ASCII characters: 'P' followed by '5' followed by one or more whitespace characters (usually ending with a newline character). Zero or more comment lines, each starting with a # and ending with a newline character. Width of image in pixels (ASCII characters indicating a decimal value) followed by one or more whitespace characters (usually ending with a newline character). Height of image in pixels (as ASCII characters indicating a decimal value) followed by one or more whitespace characters (usually ending with a newline character). Maximum value of a pixel (as ASCII characters indicating a decimal value) followed by a single whitespace character, usually a newline. For us the maximum value will always be 255. The pixel data, one byte per pixel, using unsigned binary.
The headers for Portable Pixel Map PPM images look the same as PGM except that the file starts with the two ASCII characters "P6". The pixel data has three colors per pixel rather than one value per pixel. For us, the maximum value of a pixel (both for PGM and PPM) will always be 255, which means that the pixel data will be one byte per pixel (for PGM) or three bytes per pixel (for PPM). But this program does not need to read the pixel data since it just reports on the header information.
[H-80]
Write readPGMimage()
from scratch.
The prototype of the function is:
void readPGMimage( image *img, char *filename )
where img
is a pointer to an image
struct, and filename
is a pointer to a disk file.
The answer to the previous puzzle is a good starting point for working on this
one. Here is a skeleton of readPGMimage()
.
void readPGMimage( image *img, char *filename ) { FILE *file; char buffer[1024]; int nrows, ncols, ngray ; /* open the image file for reading binary */ /* read signature on first line */ /* skip over comment lines */ /* get ncols, nrows, ngray */ /* construct an image structure of the appropriate size */ if ( !newImage( img, nrows, ncols ) ) { printf("readImage(): newImage failed\n" ); system( "pause" ); exit( EXIT_FAILURE ); } /* read the pixel data */ fread( img->pixels, 1, img->nrows*img->ncols, file ); /* close the file */ fclose( file ); }
A sensible way to test your function is to use it in a program that makes a copy of an image file:
int main ( int argc, char* argv[] ) { image img ; if ( argc != 3 ) { printf("copyImage oldImage newImage\n"); system( "pause" ); exit( EXIT_FAILURE ); } /* read in the old image */ readPGMimage( &img, argv[1]); /* write the image to disk and free memory */ writePGMimage( img, argv[2]); freeImage( &img ); system( "pause" ); }
There is no particular use for the image copy program since any method of making a copy of a binary disk file will work for a disk file that happens to be an image.
[M-15]
Read in an image from disk. Add a constant
value to each pixel of the image. Write back the result as a new image file.
C:\PuzzleFolder>addConst oldImage newImage constant
The constant might be positive or negative. If adding constant
to a pixel results in a value greater than 255 (the maximum that can be held
in a byte), set the pixel value to 255. If adding constant
to a
pixel results in a negative value, set the pixel to zero. Here is an original
gray level image:
original | 50 added to each pixel |
[M-15]
Read in an image from disk. Multiply each
pixel of the image by a scale factor. Write back the result as a new image file.
C:\PuzzleFolder>multConst oldImage newImage factor
If multiplying by factor
results in a value greater than 255
, set the pixel value to 255. Here is an original gray level image:
original | Each pixel scaled by 0.75 |
[M-15]
Write a program that reads in an image and
computes the average gray level of all the pixels and the standard deviation
of the gray level of all the pixels. Use double precision arithmetic for this.
The average gray level is the sum of all the pixels divided by the number of pixels. The standard deviation of the gray level is
sqrt( averageSquare - average*average )
where average
is the average gray level, and averageSquare
is the average of the square of the gray levels. Compute averageSquare
by squaring each pixel as it is read in and adding that square to a sum of squares.
After all pixels have been read, divide that sum by the number of pixels. The
standard deviation should always be zero or positive.
For example:
C:\PuzzleFolder>statistics kernighan.pgm average: 121.026401 standard deviation: 52.950776 Press any key to continue . . . C:\PuzzleFolder>statistics kernighanPlus50.pgm average: 170.956984 standard deviation: 52.829143 Press any key to continue . . . C:\PuzzleFolder>statistics gray240.pgm average: 240.000000 standard deviation: 0.000000 Press any key to continue . . .
(Because some pixels in kernighanPlus50.pgm
were clampled at the
value 255, the average is not exactly 50 more than the original image.)
[H-35]
For a gray level image with one byte per
pixel, a histogram is an array with cells indexed 0 to 255. Each cell contains
the count of pixels in the image that have the corresponding gray level. For
example, cell[124] contains the number of pixels that have value 124.
Often histograms are displayed graphically. Here is a histogram of the kernighan.pgm image:
| * | *** | ***** | ****** * | ******** * | ** ********* * ** | *** *********** ***** | *** ********************* | *** **** * *********************** | **** *************************************** | ************************************************ | ************************************************ | ************************************************ | ************************************************* | ************************************************** ****************************************************************************************************** average: 120.624144 standard deviation: 55.161878
Histograms are useful for inspecting images. For example, the effect of the add constant program on an image can be seen through the histogram of the new image:
| * | *** * | ***** * | ******* ** | ******* ** | ** ********* *** | ** *********** ***** | *** *************** ****** | *** *** * * * ********************** | **** *************************************** | *********************************************** | *********************************************** | ************************************************ | ************************************************ | ************************************************ ****************************************************************************************************** average: 170.542393 standard deviation: 55.026415
Write a function
void histogram ( image img, int histo[] )
that computes a histogram of an image (ie. computes just the array of pixel counts). This should be easy. Then write a function
void displayHisto( int histo[] )
that prints a histogram to the screen. This will be hard to do. A crude version prints 256 lines, where each line contains the gray level and the number of pixels at that level. This is just a print-out of the array. A somewhat better version prints the gray level, the number of pixels at that level, and a horizontal row of stars proportional to the value. A version that prints vertical barslike the above, is much more work. My version is about 50 lines long.
[M-10]
Write a function that makes a copy of an
image struct.
void copyImage( image source, image *copy )
A new image struct is created with
the same number of rows and columns as the source image struct
and a new array of pixels,
but containing the same values as in the source.
Use newImage()
to allocate the new pixels.
Then copy pixel values from
the source image to the copy.
The function does not make a copy of an image file, but a copy of image struct to image struct. This can be useful in programs that perform an operation on an image but need to keep the original unchanged.
[M-20]
Write a function that transforms the gray
levels of an image. Pixels in the range A to B are transformed into the range
0 to 255. Use a linear function for this. Pixels below value A are set to zero,
and pixels above blue B are set to 255.
For each input pixel p
do this by first
checking if it is out of range and setting p'
to zero or 255 if it is.
Otherwise, compute a value from 0.0 to 1.0:
p' = (p-A) / (B-A)
Then scale that to 0 to 255:
p' = (p-A) / (B-A) * 255
This produces a transformation as seen in the following graph:
linear transformation |
[M-15]
What if each pixel value p in an image was transformed into value p' using this formula:
p' = log256(p+1)
The term log256(p+1)
means to take the logarithm base 256 of
the value p.
For any base, the log of zero is undefined,
so using (p+1)
avoids that problem.
Now values are in the range 1 to 256.
The log256
of 1 is zero.
The log256
of 256 is one.
So input gray levels in the range 0 to 255 are transformed into output levels
in the range 0.00 to 1.00, as seen in the following graph:
log transformation |
However, it will not work to have output pixels in the range of 0.00 to 1.00 because fractional values are not allowed, and a pixel value of 1 is very dim. So scale the output pixels by multiplying them by 255. This results in a value that is again in the range 0 to 255, as seen in this graph:
255 * log transformation |
So now input pixels in the range 0 to 255 are transformed into output pixels in the range 0 to 255, but the transformation is not linear. Most of the output values are used for the lowest input values. For example, the first half of input values are transformed into the first 224 output values. This changes the image by brightening the darker pixels and fitting the bright pixels into a narrower range.
Write a function
void logTransform( image img )
that performs this transformation.
You will need to compute log256(p+1)
.
Use the formula
log256x = ln x / ln 256
where ln x
is the natural logarithm of x
.
Now write a complete program that inputs an image,
transforms it,
and writes out the transformed image to a new file.
C:\PuzzleFolder>logTransform kernighan.pgm kernighanLog.pgm
The transformation might help with under-exposed images. With an image that has balanced gray levels, the transformed images looks over-exposed:
Log Transformed Image |
You can play with images. First scale an image (Puzzle 29) to darken it, then brighten it with this transformation.
[M-10]
What if each pixel value p in an image was transformed into value p' using this formula:
p' = exp(p)
where exp(p)
means e
to the power p
.
But, the value exp(255)
is very large.
So arrange things to compute a value from 1.0 to exp(k)
where k
is a parameter.
The value k=5
works well
exp( k*(p/255) )
Now transform this into values from 0.0 to (exp(k)-1)
exp( k*(p/255) ) - 1.0
Now scale this so that the values are 0.0 to 1.0:
(exp( k*(p/255) ) - 1.0) / (exp( k ) - 1.0)
Then further scale this so that the output values are 0 to 255
255*(exp( k*(p/255) ) - 1.0) / (exp( k ) - 1.0)
The resulting image emphasizes the brighter pixels at the expense of the dimmer pixels. In the following, the transformed image was stretched (using Puzzle 33) after the exponential transformation.
Original | Exp Transformed and Stretched |
Use the histogram display (Puzzle 31) to decide on the parameters to use for stretching.
[M-20]
Create an output image that is the result of adding a scaled version of one
image to a scaled version of another. Say that the first image is imageA and
the second image is imageB and the resulting image is imageC.
Then for each location [r][c],
imageC[r][c] = scaleA*imageA[r][c] + scaleB*imageB[r][c]
Usually scaleA + scaleB = 1.0. Often, both scale factors are 0.5 so each input image contributes equally to the result image. As usual, clamp the resulting values so that the fall in the range 0..255. The parameters to the program are the names of the images and the scale factors:
C:\PuzzleFolder>addImage imageA scaleA imageB scaleB resultImage
Both images must have the same number of rows and same number of columns. For example, here the portait image is added to a ramp image:
kernRamp.pgm |
Subtraction of one image from another can be done by using a negative scale.
[M-15]
Create a new image that is half the size of the original:
C:\PuzzleFolder>halfSize original.pgm half.pgm
kernHalf.pgm |
Do this by using the upper left pixel of each 2x2 block of pixels in the original image. If there is an odd number of rows or columns in the original image, ignore the last row or column.
Now write a program that doubles the size of an image
C:\PuzzleFolder>doubleSize original.pgm double.pgm
Do this by writing a 2x2 block of pixels for each pixel in the original image. In digital phography, this is called a "digital zoom". Although the image has doubled in size, there is no new information. Here is the result when the above half-size image is doubled back to its original size:
kernHalfDouble.pgm |
The two images (this and the one above) the same amount of information in them, which is one fourth the amount of information that was in the original image. (One fourth, because halving each dimension results in one quarter the area.)
For extra fun, halve the size of an image twice, then double the size of the image twice. Although the final image will be the same size as the original. it will have one eighth the information that was in the original.
[H-30]
Create a new image that is twice the size of the original:
C:\PuzzleFolder>doubleSizeAverage original.pgm doubleAvg.pgm
Do this by writing a 2x2 block of pixels for each pixel in the original image. However, now make each of the four pixels in the block different. The upper left pixel is the same as the original pixel, the upper right pixel is the average of the original pixel and the next pixel in the same row, the lower left pixel is the average of the original pixel and the next pixel in the same column, and the lower right pixel is the average of the original pixel and the pixel to its lower right. For pixels in the last row or last column of the original image, write out a 2x2 block of the original pixel.
Output Scheme |
This makes the resulting zoomed image look smoother, but does not actually add information to the zoomed image. The image below is a double-smoothed version of the half-sized image.
Doubled Smoothed |
Another way (easier) to achieve a similar effect is to create the doubled image using puzzle 38, then to smooth that image with an image smoothing function.
[M-20]
Write a program that reflects the pixels of an image the vertical axis or
across the horizontal axis.
For reflection across the vertical axis, think of a vertical line down the
center of the image.
If a pixel was originally X columns to the right of this axis, after
reflection it is X columns to the left of it.
Image Reflection | Result |
The command line for the program should look like:
C:\PuzzleFolder>reflect original.pgm reflected.pgm [H|V]
Where [H|V]
means the last parameter is a single character H or V
to call for Horizontal or Vertical reflection.
The solution to this puzzle is left as an exercise for the reader.
Compute the root mean square difference between two images
of the same size.
This is done by simultaneously scanning through both images
and examining corresponding pixels.
Say that p
is the pixel from one image,
and that q
is the pixel from the other.
Then (p-q)
is the difference between the pixels.
The squared difference (p-q)2
will always be zero or positive.
Compute the mean squared difference by adding up
the squared differences for all the pixels and dividing by
the number of pixels in one image.
The square root of that is the RMS difference between the
two images.