Creating An Image from a Portion of an Online Newspaper Page
by Cliff Lamere May 2011
The first image below is an example of what you can do with an important newspaper article you find on the internet. After creating them, I add the images to my genealogy program or send them to relatives and/or friends in an email.
Image 1. This is just a small portion of a newspaper page that you might see on the internet. The page would be in .pdf format. If you did a search for a name in order to locate the article on the page, the name would be highlighted in color. You don't want to see the color, so you must click on the webpage to eliminate the color. Click outside the article, because wherever you click a vertical line will likely appear. You don't want to have that in the article either.
CHANGE SIZE OF ARTICLE
You want to make the text large enough to read, but not so large that the article goes beyond the edges of the screen. However, there are times when the article cannot be kept entirely on a single screen.
PRINT SCREEN
On the keyboard of your computer you will find a key called Print Screen. When you strike it, it makes a copy of everything visible at that moment on your monitor's screen. You can then open a graphics program, click on the screen, then go to Edit/Paste. What you had been looking at will appear in the graphics program, but at the top of the screen the name of the graphics program and its controls will still be visible (and they still work). Immediately below that you will see the top of the image. It will show the name of your internet browser (like Internet Explorer) and its controls, but since they are part of an image, they can't be used.
GRAPHICS PROGRAM I USE
I use IrfanView. I used to use MS Paint, which came free with each of my previous PCs. However, if you save an image with it, it adds speckles around objects that have a light background behind them. As far as I know, IrfanView does not do that.
My instructions are for IrfanView 4.28 (released December 16, 2010). Your graphics program may do just as well or better, but the features may have different names from those in IrfanView.
Click here if you wish to download IrfanView. It will send you to another site. Be alert when you get there. Multiple DOWNLOAD buttons try to trick you into downloading some other program(s). Read the fine print, if any, around the Download button to be sure you are getting IrfanView instead of some other program.
DRAW BOX AROUND ARTICLE
Image 2. You can see the rectangle drawn around the article. To make it, put your cursor at the upper left corner of the article. Hold the mouse button down, and move the mouse to the lower right corner. As you move the mouse, the rectangle forms. Let up on the mouse button when you get the rectangle positioned to your liking. It may require more than one attempt to get it just right.
CROP THE ARTICLE
Image 3. Article has been cropped. To do this click Edit/Crop selection. Anything outside the rectangle will disappear. Notice how narrow the margins are at this point. The margins should be wider so that any text next to the image (wherever you use the image) will not be too close to the text of the article. Let's add some white space (white border).
ADD WHITE BORDER
Image 4. White margins have been added; 60 pixels on the bottom and 10 pixels each on the other three sides. To do this click on Image/Change canvas size. You may prefer different border widths.
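For readers who like to see the arithmetic, the crop and border steps can be sketched in a few lines of Python. This is only an illustration, not how IrfanView works internally. It assumes the image is a simple grid of grayscale values (0 is black, 255 is white), and the function names crop and add_border are made up for this example. The default margins are the ones used for Image 4.

```python
# Illustrative only: an image is modelled as a list of rows of grayscale
# values (0 = black, 255 = white). The function names are hypothetical.
WHITE = 255


def crop(pixels, left, top, right, bottom):
    """Keep the rectangle from (left, top) up to, but not including, (right, bottom)."""
    return [row[left:right] for row in pixels[top:bottom]]


def add_border(pixels, left=10, right=10, top=10, bottom=60, fill=WHITE):
    """Surround the image with margins of the given widths, in pixels."""
    width = len(pixels[0]) if pixels else 0
    new_width = left + width + right
    padded = [[fill] * new_width for _ in range(top)]
    for row in pixels:
        padded.append([fill] * left + list(row) + [fill] * right)
    padded.extend([fill] * new_width for _ in range(bottom))
    return padded


# A made-up 300 x 200 all-black "screenshot", cropped to a 100 x 80 article.
screenshot = [[0] * 300 for _ in range(200)]
article = crop(screenshot, left=50, top=40, right=150, bottom=120)
framed = add_border(article)
print(len(framed[0]), "x", len(framed))  # 120 x 150
```

The canvas grows by the left and right margins in width (100 + 10 + 10 = 120) and by the top and bottom margins in height (80 + 10 + 60 = 150).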
SAVE IMAGE
Saving the image at this point means you won't have to start at the very beginning if something goes wrong as you continue working with the image. In case of a problem, just open the saved version and start from that point.
DRAW BOX IN BOTTOM WHITE AREA
Image 5. Draw a box in the bottom white area.
TYPE TEXT THAT WILL GO INTO BOX
Image 6. Click on Edit/Insert text into selection. Image 6 appears. Type the newspaper's name, plus the day of the week and the date. Recording the source and date will allow you to come back to the newspaper page or do further research at a later time. The day of the week can help, especially with obituaries that give only the day of the week on which the death occurred.
On this window, you should set the background color to white, choose Center for the Text alignment, and choose whichever font and size you prefer. Click OK. If the text doesn't fit into the box, you may need to choose a smaller font size or increase the bottom white space. If the article is two columns wide, you may prefer all typed text to be on a single line.
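If you know the date of the newspaper but not the day of the week, a short Python sketch like the one below can work it out and build the caption line. It can also estimate a date of death from an obituary that names only a weekday. The newspaper name and dates are invented for the example, the function names are hypothetical, and the second function assumes the death occurred within the seven days before the paper was published, which will not always be true.

```python
from datetime import date, timedelta

# English names are spelled out so the result does not depend on the
# computer's language settings.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def caption(newspaper, published):
    """Build a caption: newspaper name, day of the week, and date."""
    return (f"{newspaper}, {DAYS[published.weekday()]}, "
            f"{MONTHS[published.month - 1]} {published.day}, {published.year}")


def last_weekday_before(published, day_name):
    """Most recent date before `published` that falls on `day_name`.

    Assumption: the event happened 1 to 7 days before publication.
    """
    target = DAYS.index(day_name)
    days_back = (published.weekday() - target) % 7 or 7
    return published - timedelta(days=days_back)


paper_date = date(1925, 3, 14)  # made-up publication date
print(caption("Example Gazette", paper_date))
print(last_weekday_before(paper_date, "Thursday"))
```

Treat the estimated date as a lead to verify, not as a fact, since an obituary may have been printed more than a week after the death.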
SAVE THE IMAGE
Image 7. Image will now look like this. SAVE the image with the same name as earlier. The rectangle will disappear.
FINISHED
Image 8. This is what the final image looks like.
FINDING THE DATE OF THE ARTICLE
First, look at the top of the newspaper page. If there is no date at the top of the page, or if it is not readable, you have to try the following. Go back to the search page and look at the hit you used. The link was actually the name of a pdf file. Copy it. Paste it into the search box. Just before the .pdf is a number. Decrease that number by 1, which gives you the number of the page before the one that had the article. Open that page and look for the date at the top. If it is not there, decrease the number again. In the worst cases, where the top of each page is cut off, you may have to go all the way back to the front page of the newspaper. There, the date is below the name of the newspaper.
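The step of decreasing the page number can be sketched in Python as follows. The file name shown is invented, and real sites may name their files differently, so treat the pattern (a number immediately before .pdf) as an assumption. The function name previous_page is also made up for this example. Leading zeros are kept only when the original number had them.

```python
import re


def previous_page(pdf_name):
    """Return the file name with the number just before '.pdf' reduced by 1."""
    match = re.search(r"(\d+)(\.pdf)$", pdf_name, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no page number found just before .pdf")
    digits = match.group(1)
    page = int(digits) - 1
    if page < 0:
        raise ValueError("already at page 0")
    # Keep the zero padding only if the original number was zero padded.
    width = len(digits) if digits.startswith("0") else 0
    return pdf_name[:match.start(1)] + str(page).zfill(width) + match.group(2)


# Hypothetical file name, for illustration only.
print(previous_page("Example Gazette 1925 - 0123.pdf"))
```

Calling the function repeatedly on its own result walks back one page at a time toward the front page.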
RETRIEVING THE ARTICLE AS TEXT
While viewing the pdf newspaper page, you can use your mouse to highlight text in the article that interests you. Click on Edit/Copy (or Ctrl-c) to copy the text. Paste it into the program where you want to store the information (Edit/Paste or Ctrl-v). The text will not be accurate, so you will have to revise it. If the lines at the edge of the columns are broken, some text will be copied from other articles. You may have to copy one line at a time. For your records, also record the website address. Personally, I also record the name of the pdf file. It is not always easy to find the article later, because the words you use in the search may not have been interpreted correctly by the OCR program.
IMPROVING YOUR SEARCHES
The New York State newspaper pages you will be searching were microfilmed. Later, they were converted to pdf files. The OCR program makes lots of mistakes, because it is not reading optimum quality material.
The letter i is a problem, because the dot is often too close to the rest of the letter. The OCR program sometimes thinks it is the letter l. Sometimes the letter e is interpreted as a letter c or o.
My main research is with the surname Gardenier in New York State. Gardinier, a later spelling variation, is now more common than the original spelling. Searching on one newspaper site for Gardinier, I got 5000 hits (there were more, but 5000 is the maximum shown).
2469 hits occurred when I replaced the first i in Gardinier with an l
3351 hits occurred when I replaced the second i in Gardinier with an l
4460 hits occurred when I replaced both i's with an l
Changing Gardinier to Gardimer and Gardmier (assuming the OCR would read the -ni- or -in- as m) got me an extra 212 and 199 hits. Changing Gardenier to Gardemer got 194 more.
Using these misspellings in my searches has tremendously increased my chances of finding what I was searching for.
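If you would rather not work out the misspellings by hand, the two kinds of OCR mistakes described above (i read as l, and -ni- or -in- read as m) can be generated with a short Python sketch. The function name ocr_variants is made up for this example, and it covers only those two mistakes. You would still type each variant into the newspaper site's search box yourself.

```python
from itertools import product


def ocr_variants(name):
    """List likely OCR misreadings of a name (i -> l, and ni/in -> m)."""
    variants = set()

    # Every combination of replacing or keeping each letter i.
    positions = [index for index, letter in enumerate(name) if letter == "i"]
    for choice in product([False, True], repeat=len(positions)):
        letters = list(name)
        for position, swap in zip(positions, choice):
            if swap:
                letters[position] = "l"
        variants.add("".join(letters))

    # Each single occurrence of "ni" or "in" read as the letter m.
    for pair in ("ni", "in"):
        start = name.find(pair)
        while start != -1:
            variants.add(name[:start] + "m" + name[start + 2:])
            start = name.find(pair, start + 1)

    variants.discard(name)  # the correct spelling is not a variant
    return sorted(variants)


for surname in ("Gardinier", "Gardenier"):
    print(surname, ocr_variants(surname))
```

For Gardinier this produces the same five misspellings used in the searches above: Gardlnier, Gardinler, Gardlnler, Gardimer and Gardmier.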
Dust on the microfilm will become part of the image of the newspaper page. That might cause the OCR program to read Helen as Helfen or Heten.
OTHER THINGS I DO
If an article is long and in a single column, after adding the white space at the bottom, I sometimes cut the article into two equal pieces and save them (I call them -pt1 and -pt2). Using Image/Create Panorama image, and making sure Horizontal is the option selected, I merge the two parts into a single article that will look like the following.
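The idea of cutting a long column in two and placing the halves side by side can be sketched in Python. This is only an illustration of the layout, not IrfanView's panorama feature. It assumes the image is a grid of grayscale values (255 is white), and the function name and the optional gap between the halves are made up for this example.

```python
WHITE = 255


def split_into_two_columns(pixels, gap=0, fill=WHITE):
    """Cut a tall image in half and place the bottom half to the right of the top."""
    half = (len(pixels) + 1) // 2
    top, bottom = pixels[:half], pixels[half:]
    width = len(pixels[0]) if pixels else 0
    # If the row count is odd, pad the shorter half with a white row.
    bottom = bottom + [[fill] * width for _ in range(half - len(bottom))]
    return [list(a) + [fill] * gap + list(b) for a, b in zip(top, bottom)]


# A made-up column 4 pixels wide and 7 rows tall; each row holds its own number.
column = [[row_number] * 4 for row_number in range(7)]
merged = split_into_two_columns(column)
print(len(merged[0]), "x", len(merged))  # 8 x 4
```

In practice you would choose the cut so that it falls between two lines of text rather than exactly at the halfway row.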
REMOVING SPOTS & MICROFILM SCRATCHES
After drawing the box in image 5 above, in image 6 I typed the name and date of the newspaper. The text appeared in the box. BUT, if you draw a small box around a couple of spots, then DON'T type anything in image 6, the box you drew is filled with white space when you hit OK. The spots disappear.
Microfilm scratches appear as black lines between lines of text or even through some text. With an article that is very important to me, I sometimes draw a tiny box between words that have a line through them, then I get rid of the line between the words.
It is possible to make the image larger, which makes drawing the little boxes easier. When finished, you can reduce the image to the desired size.
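What the empty text box does to the image can be sketched in Python: every pixel inside the box is set to white. As before, this is only an illustration that assumes the image is a grid of grayscale values (0 is black, 255 is white), and the function name fill_box is made up for this example.

```python
WHITE = 255


def fill_box(pixels, left, top, right, bottom, fill=WHITE):
    """Return a copy of the image with the box (left, top)-(right, bottom) whitened.

    The right and bottom edges are not included, and a box that extends
    past the edge of the image is trimmed to fit.
    """
    cleaned = [list(row) for row in pixels]
    for y in range(max(top, 0), min(bottom, len(cleaned))):
        row = cleaned[y]
        for x in range(max(left, 0), min(right, len(row))):
            row[x] = fill
    return cleaned


# A made-up 6 x 4 white image with two black dust spots.
page = [[WHITE] * 6 for _ in range(4)]
page[1][2] = 0
page[2][3] = 0
spotless = fill_box(page, left=2, top=1, right=4, bottom=3)
print(spotless[1][2], spotless[2][3])  # 255 255
```

The original grid is left unchanged, which mirrors the advice to save a copy of the image before doing this kind of cleanup.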
SOME FREE NEW YORK STATE NEWSPAPER WEBSITES
New York State Historical Newspaper Pages (over 15 million newspaper pages from around the state)
Northern New York Historical Newspapers (Counties of Essex, Clinton, Franklin, St. Lawrence, Jefferson, Lewis and Oswego).
Altamont Enterprise (Albany County)
Putnam County Courier (Putnam County)
An assortment of New York State newspaper webpages including some that are not free.