[R] Parsing XML?

Spencer Graves spencer.graves at effectivedefense.org
Thu Jul 28 00:28:16 CEST 2022



On 7/27/22 5:04 PM, avi.e.gross using gmail.com wrote:
> General XML is not intended to be parsable as a list. But there are lots of tools you can use to extract various patterns out of XML in forms like a list.
> 
> But your data example is huge and I am falling asleep waiting to see if it loads. I looked sideways and it is not that big directly but my browser may be trying to show it as a web page.


	  You have my sympathies.  It loaded with an elapsed time of 0.55 
seconds for me:


XMLfile <-
"https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"
system.time(XMLdata <- xml2::read_xml(XMLfile))
#   user  system elapsed
#  0.048   0.010   0.550
XMLtxt <- xml2::xml_text(XMLdata)
nchar(XMLtxt)
#[1] 29415


	  From staring at those 29415 characters, I noted that 
'info:srw/schema/1/marcxmlxml' seemed to repeat in places that looked 
like breaks between records.  So I tried the following:


str(XMLt2 <- strsplit(XMLtxt, 'info:srw/schema/1/marcxmlxml')[[1]])
head(XMLt2, 3)

[1] "1.12250" 
 
 
 
 
 

[2] "00000nas a22000007i 45001030438981180404c20159999aluwr n       0 
a0eng    2018200464DLCrdaDLCeng12577-53161021110USPS711 Alabama Avenue, 
Selma, AL 36701nsdppccISSN RECORD07115Selma sunSelma sun.Selma, AL 
:North Shore Press, 
LLC2016-WeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in 
2015.Description based on: Volume 2, Issue 40 (October 5, 2017) 
(surrogate); title from caption.Latest issue consulted: Volume 2, Issue 
40 (October 5, 2017).United StatesAlabamaDallasSelma."
[3] "00000cas a22000007a 4500502150053100127c20109999aluwr n       0 
a0eng    2010200019DLCengDLCDLCOCLCQ112153-18111750USPSB & C Publishing, 
LLC, 3514 Martin St. S. Ste 104, Cropwell, AL 35054pccnsdpISSN RECORDSt. 
Clair County news (Cropwell, Ala.)St. Clair County news(Cropwell, 
Ala.)St. Clair County news.Cropwell, AL :B & C Pub.WeeklyBegan in 
2010.Description based on: Nov. 4, 2010 (surrogate); title from 
caption."


	  However, I was hoping there were other XML tools that would get me 
more information more quickly.
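
	  For example, would a namespace-aware XPath search along these lines 
be a reasonable next step?  (This is only a sketch: the namespace URIs 
are copied from the file itself, and the tag and subfield codes are just 
the ones visible in the MARC sample quoted further down this thread.)


# locate the MARC records by tag instead of by splitting the text
ns <- c(srw  = "http://www.loc.gov/zing/srw/",
        marc = "http://www.loc.gov/MARC21/slim")
recs <- xml2::xml_find_all(XMLdata, ".//marc:record", ns)
length(recs)
# one <record> node per newspaper title (50 expected for this file)
titles <- xml2::xml_text(xml2::xml_find_all(
  recs, ".//marc:datafield[@tag='245']/marc:subfield[@code='a']", ns))
head(titles)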


	  Suggestions?
	  Thanks,
	  Spencer


############

 > sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.7

Matrix products: default
LAPACK: 
/Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] sos_2.1-4     brew_1.0-7    XML_3.99-0.10 xml2_1.3.3

loaded via a namespace (and not attached):
  [1] digest_0.6.29   evaluate_0.15   rlang_1.0.4     cli_3.3.0
  [5] curl_4.3.2      rstudioapi_0.13 rmarkdown_2.14  tools_4.2.1
  [9] xfun_0.31       yaml_2.3.5      fastmap_1.1.0   compiler_4.2.1
[13] htmltools_0.5.2 knitr_1.39
 >
> 
> How about copying and pasting a sample of, say, the first few dozen lines so we can see what is in it for the purpose of ...
> 
> The schema would be mentioned in an attribute, if you know what you are looking for, and may be in an external file.
> 
> So decide what you want, like a list of all titles, and use something like xpath().
> 
> 
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of Spencer Graves
> Sent: Wednesday, July 27, 2022 4:51 PM
> To: 'R-help' <r-help using r-project.org>
> Subject: [R] Parsing XML?
> 
> Hello, All:
> 
> 
> 	  What would you suggest I do to parse the following XML file into a
> list that I can understand:
> 
> 
> XMLfile <-
> "https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"
> 
> 
> 
> 	  This is the first of 6666 XML files containing the "U.S. Newspaper
> Directory" maintained by the US Library of Congress, discussed in the
> thread below.  I've tried various things using the XML and xml2 packages.
> 
> 
> XMLdata <- xml2::read_xml(XMLfile)
> str(XMLdata)
> XMLdat <- XML::xmlParse(XMLdata)
> str(XMLdat)
> XMLtxt <- xml2::xml_text(XMLdata)
> nchar(XMLtxt)
> #[1] 29415
> 
> 
> 	  Someplace there's a schema for this.  I don't know if it's embedded
> in this XML file or in a separate file.  If it's in a separate file, how
> could I describe it to my contacts with the Library of Congress so they
> would understand what I needed and could help me get it?
> 
> 
> 	  Thanks,
> 	  Spencer Graves
> 
> 
> p.s.  All 29415 characters in XMLtxt appear in the thread below.
> 
> 
> -------- Forwarded Message --------
> Subject: 	[Newspapers and Current Periodicals] How can I get counts of
> the numbers of newspapers by year in the US, and preferably also
> elsewhere? A search of "U.S. Newspaper Directory,
> Date: 	Wed, 27 Jul 2022 14:59:03 +0000
> From: 	Kerry Huller <serials using ask.loc.gov>
> To: 	Spencer Graves <spencer.graves using effectivedefense.org>
> CC: 	twes using loc.gov
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 27 2022, 10:59am via System
> 
> Hello Spencer,
> 
> So, when I view the xml, I'm actually looking at it in XML editor
> software, so I can view the tags and it's structured neatly. I've copied
> and pasted the text from the beginning of the file and the first
> newspaper title below from my XML editor:
> 
> <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> <?xml-stylesheet type='text/xsl'
> href='/webservices/catalog/xsl/searchRetrieveResponse.xsl'?>
> 
> <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"
> xmlns:oclcterms="http://purl.org/oclc/terms/"
> xmlns:dc="http://purl.org/dc/elements/1.1/"
> xmlns:diag="http://www.loc.gov/zing/srw/diagnostic/"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
> <version>1.1</version>
> <numberOfRecords>2250</numberOfRecords>
> <records>
> <record>
> <recordSchema>info:srw/schema/1/marcxml</recordSchema>
> <recordPacking>xml</recordPacking>
> <recordData>
> <record xmlns="http://www.loc.gov/MARC21/slim">
>        <leader>00000nas a22000007i 4500</leader>
>        <controlfield tag="001">1030438981</controlfield>
>        <controlfield tag="008">180404c20159999aluwr n       0   a0eng
>    </controlfield>
>        <datafield ind1=" " ind2=" " tag="010">
>          <subfield code="a">  2018200464</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="040">
>          <subfield code="a">DLC</subfield>
>          <subfield code="e">rda</subfield>
>          <subfield code="c">DLC</subfield>
>          <subfield code="b">eng</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="012">
>          <subfield code="m">1</subfield>
>        </datafield>
>        <datafield ind1="0" ind2=" " tag="022">
>          <subfield code="a">2577-5316</subfield>
>          <subfield code="2">1</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="032">
>          <subfield code="a">021110</subfield>
>          <subfield code="b">USPS</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="037">
>          <subfield code="b">711 Alabama Avenue, Selma, AL 36701</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="042">
>          <subfield code="a">nsdp</subfield>
>          <subfield code="a">pcc</subfield>
>        </datafield>
>        <datafield ind1="1" ind2="0" tag="050">
>          <subfield code="a">ISSN RECORD</subfield>
>        </datafield>
>        <datafield ind1="1" ind2="0" tag="082">
>          <subfield code="a">071</subfield>
>          <subfield code="2">15</subfield>
>        </datafield>
>        <datafield ind1=" " ind2="0" tag="222">
>          <subfield code="a">Selma sun</subfield>
>        </datafield>
>        <datafield ind1="0" ind2="0" tag="245">
>          <subfield code="a">Selma sun.</subfield>
>        </datafield>
>        <datafield ind1=" " ind2="1" tag="264">
>          <subfield code="a">Selma, AL :</subfield>
>          <subfield code="b">North Shore Press, LLC</subfield>
>          <subfield code="c">2016-</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="310">
>          <subfield code="a">Weekly</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="336">
>          <subfield code="a">text</subfield>
>          <subfield code="b">txt</subfield>
>          <subfield code="2">rdacontent</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="337">
>          <subfield code="a">unmediated</subfield>
>          <subfield code="b">n</subfield>
>          <subfield code="2">rdamedia</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="338">
>          <subfield code="a">volume</subfield>
>          <subfield code="b">nc</subfield>
>          <subfield code="2">rdacarrier</subfield>
>        </datafield>
>        <datafield ind1="1" ind2=" " tag="362">
>          <subfield code="a">Began in 2015.</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="588">
>          <subfield code="a">Description based on: Volume 2, Issue 40
> (October 5, 2017) (surrogate); title from caption.</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="588">
>          <subfield code="a">Latest issue consulted: Volume 2, Issue 40
> (October 5, 2017).</subfield>
>        </datafield>
>        <datafield ind1=" " ind2=" " tag="752">
>          <subfield code="a">United States</subfield>
>          <subfield code="b">Alabama</subfield>
>          <subfield code="c">Dallas</subfield>
>          <subfield code="d">Selma.</subfield>
>        </datafield>
>      </record>
> </recordData>
> </record>
> 
> When I view the records in the XML editor, the 2 lines below do begin
> each of the records for each individual title, but of course this
> includes the XML tags:
> 
> <recordSchema>info:srw/schema/1/marcxml</recordSchema>
> <recordPacking>xml</recordPacking>
> 
> Hopefully this helps you decide where to break or parse each record.
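> 
> [A minimal R sketch of that idea, splitting on the XML structure rather
> than on the text; the namespace URIs come from the header shown above,
> and this is untested beyond this first file:]
> 
> # XMLfile as defined earlier in this thread
> doc <- xml2::read_xml(XMLfile)
> ns  <- c(srw  = "http://www.loc.gov/zing/srw/",
>          marc = "http://www.loc.gov/MARC21/slim")
> # each newspaper title is one MARC <record> inside <recordData>
> recs <- xml2::xml_find_all(doc, ".//srw:recordData/marc:record", ns)
> # flatten one record into a tag/code/value table
> marc_df <- function(rec) {
>   df <- xml2::xml_find_all(rec, ".//marc:datafield", ns)
>   do.call(rbind, lapply(df, function(d) {
>     sf <- xml2::xml_find_all(d, "./marc:subfield", ns)
>     data.frame(tag   = xml2::xml_attr(d, "tag"),
>                code  = xml2::xml_attr(sf, "code"),
>                value = xml2::xml_text(sf),
>                stringsAsFactors = FALSE)
>   }))
> }
> str(marc_df(recs[[1]]))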
> 
> On another note, I just noticed as well that at the top of this first
> file it lists the total number of records for the Alabama grouping -
> 2250. This also appeared to be the case for the Alaska records when I
> took a look at the first one for that state. I imagine that should be
> consistent throughout each "grouping" of records.
> 
> Let me know if you have follow-up questions!
> 
> Best wishes,
> 
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 27 2022, 10:21am via Email
> 
> Hi, Kerry:
> 
> 
> Thanks. I understand the chunking in files of at most 50. I've read
> the first file "ndnp_Alabama_all-yrs_e_0001_0050.xml" into a string of
> 29415 characters, copied below. Might you have any suggestions on the
> next step in parsing this? Staring at it now, it looks like splitting on
> "info:srw/schema/1/marcxmlxml" might convert the 29415 characters into
> shorter chunks, each of which could then be parsed further.
> 
> 
> This is not as bad as reading ancient Egyptian hieroglyphics without
> the Rosetta Stone, but I wondered if you might have something that could
> make this work easier and more reliable? I guess I could compare with
> what I already read as JSON ;-)
> 
> 
> Thanks,
> Spencer Graves
> 
> 
> "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i
> 45001030438981180404c20159999aluwr n 0 a0eng
> 2018200464DLCrdaDLCeng12577-53161021110USPS711 Alabama Avenue, Selma, AL
> 36701nsdppccISSN RECORD07115Selma sunSelma sun.Selma, AL :North Shore
> Press,
> LLC2016-WeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in
> 2015.Description based on: Volume 2, Issue 40 (October 5, 2017)
> (surrogate); title from caption.Latest issue consulted: Volume 2, Issue
> 40 (October 5, 2017).United
> StatesAlabamaDallasSelma.info:srw/schema/1/marcxmlxml00000cas a22000007a
> 4500502150053100127c20109999aluwr n 0 a0eng
> 2010200019DLCengDLCDLCOCLCQ112153-18111750USPSB & C Publishing, LLC,
> 3514 Martin St. S. Ste 104, Cropwell, AL 35054pccnsdpISSN RECORDSt.
> Clair County news (Cropwell, Ala.)St. Clair County news(Cropwell,
> Ala.)St. Clair County news.Cropwell, AL :B & C Pub.WeeklyBegan in
> 2010.Description based on: Nov. 4, 2010 (surrogate); title from
> caption.info:srw/schema/1/marcxmlxml00000cas a22000007a
> 4500426491872090720c20099999alumr n 0 a0eng
> 2009203372DLCengDLCOCLCQ12150-346X2150-346X1AU using 000044489617NZ116076352Devon
> Applewhite/Applewhite Publishing Co., 1910 Honeysuckle Rd., #N183,
> Dothan, AL 36305mscnsdpISSN RECORD30514Triangle tribune(Dothan,
> Ala.)Triangle tribune.Dothan, AL :Applewhite Pub. CoMonthlyBegan with
> vol. 1, issue 1 (May 2009).\"Connecting the Tri-State African -American
> Community.\"Description based on: Vol. 1, issue 1 (May 2009); title from
> masthead.Applewhite, Devon.United StatesAlabama.United
> StatesGeorgia.United StatesFlorida.info:srw/schema/1/marcxmlxml00000cas
> a22000007a 4500289017315081219c20089999aluwr n | a0eng c
> 2008213218NSDengNSDOCLCQDLCOCLCQ111945-93191945-93191005270USPSSpringhill Publications,
> LLC, P.O. Box 186, Greenville, AL 36037nsdppccISSN RECORD07014Greenville
> standardThe Greenville standard.Greenville, AL :Springhill
> PublicationsWeeklytexttxtrdacontentunmediatednrdamediaBegan with vol. 1,
> issue 1 (Sept. 3, 2008)Description based on surrogate of: Vol. 1, no. 15
> (Dec. 18, 2008); title from masthead (publisher's Web site, viewed Dec.
> 19, 2008).Latest issue consulted: Vol. 1, no. 99 (July 27, 2011)
> (surrogate).info:srw/schema/1/marcxmlxml00000cas a22000007a
> 4500123539969070426c20079999aluwr ne 0 a0eng c
> 2007212138NSDengNSDNSDOCLCQ101936-95571936-95571The Western Tribune,
> 1530 Third Ave. N., Bessemer, AL 35020mscnsdpISSN RECORDWestern tribune
> (Bessemer, Ala.)The Western tribune(Bessemer, Ala.)The Western
> tribune.Bessemer, Ala. :D-Med, Inc.v.WeeklyBegan in 2007.Description
> based on: May 23, 2007 (surrogate); title from
> caption.AU using 000041575341info:srw/schema/1/marcxmlxml00000cas a22000007a
> 4500226300653080425c20079999aluwr ne | a0eng
> 2008212112NSDengNSDNSDOCLCQ11942-20751942-20751nsdppccISSN RECORDThe
> corridor messengerThe corridor messenger.Carbon Hill, AL :Corridor
> Messenger, Inc.WeeklyBegan with vol. 1, issue (10.03.2007).Description
> based on: 1st issue.United StatesAlabamaWalkerCarbon
> Hill.http://www.corridormessenger.cominfo:srw/schema/1/marcxmlxml00000cas a22000007a
> 450077560432070109c20069999aluwr ne 0 a0eng c
> 2007213400NSDengNSDOCLCQAUBRNOCLCOOCLCFa01935-37901935-37901AU using 000041190283The
> 
> Auburn Villager, P.O. Box 1633, Auburn, AL 36831-1633pccnsdpISSN
> RECORDThe Auburn villagerThe Auburn villager.Auburn, AL :Auburn
> Villagerv.WeeklyBegan in 2006.Description based on: Vol. 1, no. 4 (July
> 20, 2006) (surrogate); title from caption.Auburn (Ala.)Newspapers.Lee
> County (Ala.)Newspapers.AlabamaAuburn.fast(OCoLC)fst01209634AlabamaLee
> County.fast(OCoLC)fst01211930Newspapers.fast(OCoLC)fst01423814United
> StatesAlabamaLeeAuburn.info:srw/schema/1/marcxmlxml00000cas a2200000Ii
> 4500872286785m o d s cr mn|---a||||140311c20069999alucr n o b
> s0 a0eng cABCengrdaABCABCOCLCFLD59.13University of Alabama at
> Birmingham.The eReporter.[Birmingham, Alabama] :The University of
> Alabama at Birmingham,[2006]-[Birmingham, Alabama] :Offices of Public
> Relations & Marketing and Information Technology1 online resource2
> issues weeklytexttxtrdacontentcomputercrdamediaonline
> resourcecrrdacarrierSeptember 19, 2006-\"The eReporter is an official
> communication of The University of Alabama at Birmingham, companion to
> the UAB Reporter and recommended alternative to mass e-mails.\"Issues
> for <March 11, 2014- published and distributed via e-mail subscription
> on Tuesdays and Fridays.Description based on: September 19, 2006; title
> from title screen (viewed March 12, 2014).University of Alabama at
> BirminghamPeriodicals.Periodicals.fast(OCoLC)fst01411641University of
> Alabama at Birmingham.fast(OCoLC)fst00645114University of Alabama at
> Birmingham.Office of Public Relations and Marketing.University of
> Alabama at Birmingham.Information Technology.2006-2012, companion
> to:University of Alabama at Birmingham.UAB
> reporter.(OCoLC)32435748Archived
> issueshttp://hatteras.dpo.uab.edu/cgi-bin/ereporter.cgiinfo:srw/schema/1/marcxmlxml00000cas
> 
> a22000007a 4500166387050070829c20059999aluwr ne | a0eng c
> 2007215501NSDengNSDOCLCQ11939-68991939-68991The Wilkie Clark Memorial
> Foundation, P.O. Box 514, Roanoke, AL 36274$30.00nsdpmscISSN
> RECORD305.89614People's voice (Roanoke, Ala.)The people's voice(Roanoke,
> Ala.)The people's voice.Roanoke, AL :Wilkie Clark Memorial
> Foundationv.WeeklyBegan with vol. 1, no. 1 in 2005.Description based on:
> Vol. 2, no. 20 (Apr. 20, 2007); title from caption.Wilkie Clark Memorial
> Foundation.United
> StatesAlabamaRandolphRoanoke.AU using 000042141390info:srw/schema/1/marcxmlxml00000nas
> 
> a22000007i 45001124677787191021c20uu9999aluwr ne | a0eng
> 2019202521DLCengrdaDLC12689-3258122730USPSNorth Jackson Press, 42950 Hwy
> 72, Suite 406, Stevenson, AL 35772nsdppccISSN RECORD071.323North Jackson
> pressNorth Jackson press.Stevenson, AL :Caney Creek Publications
> LLCWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierDescription
> based on surrogate of: Volume 1, number 36 (October 11, 2019); title
> from masthead.Latest issue consulted: Volume 1, number 36 (October 11,
> 2019) (Surrogate).United
> StatesAlabamaJacksonStevensoninfo:srw/schema/1/marcxmlxml00000cas
> a2200000 a 4500226315099080428d19981998aluwr ne | 0eng c
> 2008233691GUAengGUAOCLCQOCLCFOCLCO39911644pccn-us-gaThe Dekalb
> news.Birmingham, Ala. :Community newspaper holdings Inc.v.WeeklyBegan
> with 1st year, no. 1 (Apr. 1, 1998); ceased with 1st year, no. 31 (Oct.
> 28, 1998).Final issue consulted.Description based on first issue; title
> from caption.Decatur (Ga.)Newspapers.DeKalb County
> (Ga.)Newspapers.Newspapers.fast(OCoLC)fst01423814GeorgiaDecatur.fast(OCoLC)fst01226234GeorgiaDeKalb
> 
> County.fast(OCoLC)fst01215288United
> StatesGeorgiaDeKalbDecatur.Decatur-DeKalb news/era(DLC)sn
> 89053661(OCoLC)19946163info:srw/schema/1/marcxmlxml00000cas a2200000 i
> 450050263311m o d cr cn|||||||||020730c19979999alu x neo
> 0 a0eng c
> 2015238492AMHengrdapnAMHOCLCQOCLCFOCLCOIULOCLHTMOCLCQCOODLC66460694810970435082687-93791AU using 000050711528OCLCS45109pccnsdpn-us---AP2.B5707023Birmingham
> 
> weekly (Online)Birmingham weekly(Online)Birmingham weekly.Birmingham, AL
> :Birmingham Weekly1 online resourceIrregular,Feb. 16-28,
> 2012-Weekly,Sept. 4-11, 1997-Feb. 9-16,
> 2012texttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierBegan
> with vol. 1, issue 1 (Sept. 4-11, 1997).\"City news, views &
> entertainment\"--Cover.Numbering dropped in Mar. 2012.Also issued in
> print.Description based on: Publication information from ProQuest; title
> from web page (viewed June 18, 2015).Latest issue consulted: Aug. 15-20,
> 2012.Birmingham (Ala.)Newspapers.Internet resources.Electronic
> journals.AlabamaBirmingham.fast(OCoLC)fst01204958Newspapers.fast(OCoLC)fst01423814United
> 
> StatesAlabamaBirmingham.Print version:Birmingham
> Weekly(OCoLC)39271050http://apw.softlineweb.com/http://WC2VB5MT8E.search.serialssolutions.com/?sid=sersol&SS_jc=JC_000051895&title=Birmingham+Weeklyinfo:srw/schema/1/marcxmlxml00000cas
> 
> a22000007a 450031471314941116d19941995aluwr ne 0 a0eng csn
> 94003083
> NSDengNSDANEOCLCQOCLCFOCLCOOCLCQ11079-65411079-65411nsdppccn-us-akSoutheast
> shopperSoutheast shopper.Juneau, Alaska :Kemper
> Communications,1994-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol.
> 
> 1, no. 1 (Nov. 16, 1994)-Ceased in Feb. 1995.Juneau
> (Alaska)Newspapers.AlaskaJuneau.fast(OCoLC)fst01213587Newspapers.fast(OCoLC)fst01423814United
> 
> StatesAlaskaJuneau.AU using 000011356572info:srw/schema/1/marcxmlxml00000cas
> a22000008a 450027910515930413c19949999alumr n 0 a0eng dsn
> 93002581 NSDengNSDOCLCQ11069-06621Birmingham Tribune, 216 Ave. T. Pratt
> City, Birmingham, AL 35214nsdpBirmingham tribuneBirmingham
> tribune.Birmingham, Ala. :Kervin
> Fondren9501volumesMonthlytexttxtrdacontentunmediatednrdamediavolumencrdacarrierPREPUB:
> 
> publication expected Jan.
> 1995AU using 000025863987info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450026199931920716d19922013alumr ne 0 a0eng csn 92003357
> NSDengNSDOCLOCLCQDLC011064-01341064-01341Black & White, POB 13215,
> Birmingham, AL 35202-3215nsdppccBlack & white (Birmingham, Ala.)Black &
> white(Birmingham, Ala.)Black & white.Black and whiteBirmingham, Ala.
> :Black & White, Inc.v.Biweekly,Oct. 2, 1997-Monthly,May 1, 1992-Sept.
> 1997Began in May 1992; ceased with Jan. 10, 2013.\"Birmingham's New City
> paper.\"Description based on: June 1992.Latest issue consulted: No. 67
> (Oct. 16, 1997) (surrogate).info:srw/schema/1/marcxmlxml00000cas
> a2200000 a 450032145723950314d19901999alumr ne 0 a0eng csn
> 95068755
> MGNengMGNNSDCLUOCLCQOCLCFOCLCOOCLCA971211082-34841082-34841AU using 000011579542nsdppccn-us-alF335.J5S68The
> 
> Southern shofarThe Southern shofar.Birmingham, AL :L. Brook,-[1999]v.
> :ill. ;35 cm.MonthlyBegan in 1990.-v. 9, issue 9 (Aug./Sept. 1999).\"The
> monthly newspaper of Alabama's Jewish community.\"Some issues also
> available on the Internet via the World Wide Web.Description based on:
> Vol. 3, issue 11 (Oct. 1993).Jewish newspapersAlabama.Jewish
> newspapers.fast(OCoLC)fst00982872Alabama.fast(OCoLC)fst01204694United
> StatesAlabamaJeffersonBirmingham.Deep South Jewish voice(DLC)sn
> 99018499(OCoLC)42431704CLUhttp://bibpurl.oclc.org/web/719http://www.bham.net/shofar/info:srw/schema/1/marcxmlxml00000cas
> 
> a22000007a 450021265141900326c19909999aluwr ne 0 a0eng csn
> 90099004 AARengAARCPNNSDOCLCQ11050-08981050-08981005022USPSE.O.N., Inc.,
> Main St., Eclectic, AL 36024pccnsdpISSN RECORDThe Eclectic observerThe
> Eclectic observer.Eclectic, Ala. :E.O.N., Inc.,1990-v.WeeklyVol. 1, no.
> 1 (Feb. 22, 1990)-Published by: Price Publications, Inc., <2006->Latest
> issue consulted: Vol. 17, no. 1 (Jan. 5, 2006).United
> StatesAlabamaElmoreEclectic.AU using 000040212446info:srw/schema/1/marcxmlxml00000cas
> 
> a22000007a 450021214781900314c19909999aluir ne 0 a0eng csn
> 90002457 AAAengAAANSDOCLCQ111050-20841050-20841931180USPSClanton
> Newspapers, 1109 Seventh St., N., PO Box 1379, Clanton, AL
> 35045nsdppccn-us-alThe Clanton advertiserThe Clanton
> advertiser.AdvertiserClanton, Ala. :Clanton Newspapersv. :ill. ;58
> cm.Three no. a week,<May 13, 1992->Semiweekly,<Apr. 4, 1990->Began in
> Jan. 1990.Description based on: Vol. 19, no. 27 (Wed., Apr. 4,
> 1990).Latest issue consulted: Vol. 22, no. 58 (May 13, 1992).United
> StatesAlabamaChiltonClanton.Independent advertiser (Clanton,
> Ala.)(OCoLC)21214732AU using 000025908452info:srw/schema/1/marcxmlxml00000cas
> a2200000 a 450021214814900314c19909999aluwr ne 0 a0eng dsn
> 90099009 AAAengAAACPNNSDOCLCQ11056-32881056-32881505740USPSThe Blount
> Countian, 3rd St. at Washington Ave., PO Box 310, Oneonta, AL
> 35121mscnsdpn-us-alThe Blount countianThe Blount countian.Oneonta, Ala.
> :Southern Democrat, Inc.,1990-v. :ill.WeeklyVol. 1, no. 1 (Jan. 3,
> 1990)-Editor: Molly Howard Ryan, 1990-Latest issue consulted: Vol. 1,
> no. 36 (Sept. 5, 1990).Ryan, Molly Howard.United
> StatesAlabamaBlountOneonta.Southern Democrat(DLC)sn
> 85044741(OCoLC)12038577AU using 000025884049info:srw/schema/1/marcxmlxml00000cas
> a22000007a 450022413044900920c19909999aluwr ne 0 a0eng dsn
> 90099011
> AARengAARCPNNSDNSTOCLCQ92081707011191053-91231053-91231314240USPSmscnsdpThe
> Clay times-journalThe Clay times-journal.Lineville, Ala. :C.L.
> Proctor,1990-v.WeeklyVol. 1, no. 1 (Sept. 6, 1990)-United
> StatesAlabamaClayLineville.Ashland progress(DLC)sn 85044701Lineville
> tribune(DLC)sn 85044702AUinfo:srw/schema/1/marcxmlxml00000cas a22000007a
> 450021265218900326c19909999aluwr ne 0 0eng dsn 90099005
> AARengAARCPNOCLCQmscTrussville news-journal.Trussville, Ala. :Mike
> Mitchell,1990-v.BimonthlyVol. 1, no. 1 (Feb. 20, 1990)-United
> StatesAlabamaJeffersonTrussville.info:srw/schema/1/marcxmlxml00000cas
> a22000007a 450022301035900831c19909999aluwr ne 0 0eng dsn
> 90099010 AARengAARCPNOCLCQmscWeaver tribune.Oxford, Ala. :Cheaha
> Pub.,1990-v.WeeklyVol. 1, no. 1 (July 19, 1990)-United
> StatesAlabamaCalhounWeaver.United
> StatesAlabamaCalhounOxford.info:srw/schema/1/marcxmlxml00000cas
> a22000007a 450015155895870205c19879999aludr ne 0 a0eng csn
> 87050045
> AAAengAAACPNNSDDLCCPNNSDDLCCPNDLCOCLDLCOCLCQOCLCFOCLCQ19261126829944596670892-44570892-44571AU using 000020456714360980USPSThe
> 
> Advertiser, P.O. Box 1000, Montgomery, AL
> 36192pccnsdpn-us-alNewspaperMontgomery advertiser (Montgomery, Ala. :
> 1987)The Montgomery advertiser(1987)The Montgomery advertiser.Montgomery
> advertiser & the Alabama journalSunday Montgomery advertiserMontgomery,
> Ala. :Advertiser Co.,1987-volumes
> :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier160th
> 
> year, no. 1 (Jan. 2, 1987)-On Saturdays, Sundays and holidays a combined
> edition is published with the Alabama journal, and called: Montgomery
> advertiser and the Alabama journal, Jan. 3, 1987, and: Alabama journal
> and Montgomery advertiser, Jan. 4, 1987-Feb. 25, 1990.Issues for Sunday
> called: Sunday Montgomery advertiser, Mar. 4, 1990-Issues for Saturday,
> Sunday and holidays have their own numbering, Jan. 3, 1987-Feb. 25,
> 1990.Montgomery
> (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United
> 
> StatesAlabamaMontgomeryMontgomery.Advertiser (Montgomery,
> Ala.)0745-3221(DLC)sn 82008412(OCoLC)9049482Alabama journal (Montgomery,
> Ala. : 1940)0745-323X(DLC)sn
> 87062018(OCoLC)2666111info:srw/schema/1/marcxmlxml00000cas a2200000 a
> 450016942287871105c19879999aludn ne 0 a0eng dsn 88050149
> AAAengAAACPNNSDOCLCQy1044-00701044-0070746--32780746-32781565580USPSTroy
> Publications, Inc., 113 North Market St., Troy, AL 36081mscnsdpMessenger
> (Troy, Ala.)The Messenger(Troy, Ala.)The Messenger.Troy, Ala. :Troy
> Pub.,1987-v.Daily (Sunday, Tuesday, Thursday and Friday)Vol. 121, no.
> 166 (July 1, 1987)-Sunday, Apr. 2, 1989 misprinted as v. 113.Latest
> issue consulted: Vol. 113 [sic 123], no. 96 (Sunday, Apr. 2,
> 1989).United StatesAlabamaPikeTroy.Troy messenger0746-3278(DLC)sn
> 83009935(OCoLC)9921908info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450017799786880415c19879999aluir ne 0 a0eng dsn 88050086
> AARengAARCPNNSDOCLCQ1p1044-03801044-03800745-75961441520USPSThe
> Prattville Progress, 152 W. 3rd St., Prattville, AL
> 36067mscnsdpPrattville progress (Prattville, Ala. : 1987)The Prattville
> progress(Prattville, Ala.)The Prattville progress.Prattville, Ala.
> :James C. Seymour,1987-v.Three times a weekVol. 102, no. 8 (Jan. 20,
> 1987)-Latest issue consulted: Vol. 105, no. 153 (Wednesday, Dec. 26,
> 1990).United StatesAlabamaAutaugaPrattville.Progress (Prattville,
> Ala.)0745-7596(DLC)sn
> 83007623(OCoLC)9428489info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450015344667870319c19869999aluwr ne 0 a0eng dsn 87000284
> NSDengNSDCPNOCLCQy0893-07670893-07671431800USPSPickens County Herald,
> P.O. Drawer E, Carrollton, AL 35447nsdpPickens County heraldPickens
> County herald.Pickens County herald and west AlabamianCarrollton, Ala.
> :Pickens Newspapers, Inc.,1986-WeeklyVol. 138, no. 40 (Oct. 2,
> 1986)-United StatesAlabamaPickensCarrollton.Pickens County herald and
> west Alabamian0746-0473(DLC)sn
> 83008141AU using 000040635809info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450018917586881217c19869999aluwr ne 0 0eng dsn 88050225
> CPNengCPNOCLCQmscThe Oxford sun/times.Oxford, Ala.
> :[s.n.],1986-v.WeeklyVol. 1, no. 1 (Jan. 16, 1986)-Editor: Andy
> Goggans.Numbering is irregular.United StatesAlabamaCalhounOxford.Oxford
> sun (Oxford, Ala.)(DLC)sn
> 85045023AU using 000025803813info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450013991168860731c19869999aluwr ne 0 0eng dsn 86050322
> CPNengCPNOCLCQmscIndependent (Brewton, Ala.)The Independent.Brewton,
> Ala. :Jim Thornton,1986-v. :ill. ;58 cm.WeeklyVol. 1, no. 1 (June 19,
> 1986)-United
> StatesAlabamaEscambiaBrewton.info:srw/schema/1/marcxmlxml00000cas
> a22000007a 450018957493881231c19859999aluwr ne 0 0eng dsn
> 88050247 CPNengCPNOCLCQmscPiedmont journal-independent (Piedmont,
> Ala.)The Piedmont journal-independent.Journal independentPiedmont, Ala.
> :Lane Weatherbee,1985-v.WeeklyVol. 4, no. 52 (Dec. 24, 1985)-Sometimes
> published as: Journal independent.United
> StatesAlabamaCalhounPiedmont.Journal-independent(DLC)sn
> 85045014info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450012715821851024d19841985aluwr ne 0 a0eng dsn 85045014
> CPNengCPNNSDCPNOCLCQmscThe Journal-independent.Piedmont, Ala.
> :Journal-Independent, Inc.,1984-1985.volumes :illustrations ;58
> cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 3,
> no. 27 (July 3, 1984)- v. 4, no. 51 (Dec. 18, 1985).Carries the same
> vol. numbering as the Piedmont journal-independent.United
> StatesAlabamaCalhounPiedmont.Piedmont
> journal-independent0890-6017(DLC)sn 85045013Piedmont journal-independent
> (Piedmont, Ala.)(DLC)sn 88050247info:srw/schema/1/marcxmlxml00000cas
> a22000007a 450012691448851018c19839999aludr ne 0 0eng dsn
> 85045007 CPNengCPNOCLCQmscTimesDaily.Times dailyFlorence, Ala. :T.S.P.
> Newspapers, Inc.,1983-volumes :illustrations ;58
> cmDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 114,
> no. 226 (Aug. 14, 1983)-United StatesAlabamaLauderdaleFlorence.Florence
> times + tri-cities daily(DLC)sn
> 85044995info:srw/schema/1/marcxmlxml00000cas a22000007a
> 45009428489830420d19831987aluir ne 0 a0eng dsn 83007623
> NSDengNSDCPNNSDNSTOCLCQ89090d0745-75960745-75961The Progress, 152 W. 3rd
> St., Prattville, AL 36067nsdpmscProgress (Prattville, Ala.)The
> Progress(Prattville, Ala.)The Progress.Prattville, Ala. :The Prattville
> Progress,1983-1987.volumes :illustrations ;58 cmThree times a
> weektexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 98, no.
> 32 (Mar. 17, 1983)-v. 102, no. 7 (Jan. 17, 1987).United
> StatesAlabamaAutaugaPrattville.Prattville progress(DLC)sn
> 85044740Prattville progress (Prattville, Ala.)1044-0380(DLC)sn
> 88050086(OCoLC)12254317AAPinfo:srw/schema/1/marcxmlxml00000cas a2200000
> a 45009867255830831c19839999aludr ne 0 a0eng dsn 84008052
> AAAengAAANSDOCLOCLCQX0743-15110743-15111617760USPST.S.P. Newspapers,
> Inc., 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Shoals
> edition)TimesDaily(Shoals ed.)TimesDaily.Times dailyShoals ed.Florence,
> Ala. :T.S.P. Newspapersvolumes
> :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan
> 
> with: Vol. 114, no. 226 (Aug. 14,
> 1983).\"Florence/Sheffield/Tuscumbia/Muscle Shoals.\"Shoals ed. and
> Regional ed. combined on Sundays.Description based on: Vol. 114, no. 346
> (Monday, Dec. 12, 1983).United
> StatesAlabamaLauderdaleFlorence.TimesDaily (Regional
> edition)0743-152XTimes Tri-cities dailyUnknownDec. 12,
> 1983info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450010536023840319c19839999aludr ne 0 a0eng dsn 84008051
> NSDengNSDOCLCQ1x0743-152X0743-152X1617760USPST.S.P. Newspapers, Inc.,
> 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Regional
> edition)TimesDaily(Regional ed.)TimesDaily.Times dailyRegional
> ed.Florence, Ala. :T.S.P.
> NewspapersDailytexttxtrdacontentunmediatednrdamediaBegan with: Vol. 114,
> no. 226 (Aug. 14, 1983).Shoals ed. and Regional ed. combined on
> Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12,
> 1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Shoals
> edition)0743-1511Times Tri-cities dailyDec. 12,
> 1983AU using 000025818125info:srw/schema/1/marcxmlxml00000cas a22000007a
> 45009049482821213d19821987aludn ne 0 a0eng csn 82008412
> AAAengAAANSDNPWCPNDLCCPNNSDDLCNSDDLCCPNNVFDLCOCLCQCRLOCLCFOCLCQ1d0745-32210745-32211nsdppccn-us-alNewspaperAdvertiser
> 
> (Montgomery, Ala.)The Advertiser(Montgomery, Ala.)The advertiser.Alabama
> journal and advertiserMontgomery, Ala. :Advertiser Co.,1982-1987.volumes
> :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier155th
> 
> year, no. 232 (Nov. 22, 1982)- ; -v. 14-3, Jan. 1, 1987.On Saturdays,
> Sundays and holidays published as: The Alabama journal and advertiser,
> Nov. 27, 1982-Jan. 1, 1987.Saturday, Sunday and holiday issues have
> their own numbering.Montgomery
> (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United
> 
> StatesAlabamaMontgomeryMontgomery.Montgomery advertiser (Montgomery,
> Ala. : Daily)(DLC)sn 84020645(OCoLC)2685433Montgomery advertiser
> (Montgomery, Ala. : 1987)0892-4457(DLC)sn
> 87050045(OCoLC)15155895AU using 000020281746info:srw/schema/1/marcxmlxml00000cas
> a2200000 a 45009237931830218c19829999aluwr ne 0 0eng dsn
> 86050139 AAAengAAACPNOCLOCLCQmscThe Randolph leader.Roanoke, Ala. :David
> S. Stevenson,1982-volumes :illustrations ;58
> cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 91,
> no. 1 (Oct. 6, 1982)-United StatesAlabamaRandolphRoanoke.Roanoke
> leader(DLC)sn 86050137Randolph press(DLC)sn
> 86050138info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450012715815851024d19821984aluwr ne 0 a0eng dsn 85045013
> CPNengCPNNSDCPNOCLCQ110890-60170890-60171432080USPSThe Piedmont
> Journal-Independent, 115 N. Center Ave., Piedmont, AL 36272mscnsdpThe
> Piedmont journal-independentThe Piedmont journal-independent.Piedmont,
> Ala. :Piedmont Journal-Independent, Inc.,1982-1984.volumes
> :illustrations ;58
> cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1,
> no. 1 (Mar. 31, 1982)-v. 3, no. 26 (June 27, 1984).Latest issue
> consulted: Vol. 5, no. 31 (August 20, 1986).United
> StatesAlabamaCalhounPiedmont.Piedmont journal(DLC)sn
> 85045012Journal-independent(DLC)sn
> 85045014(OCoLC)12715821AU using 000045312916info:srw/schema/1/marcxmlxml00000cas
> a22000007a 45009183905830202c19829999aluwr n 0 a0eng dsn
> 85044580 AAAengAAACPNNSDOCLOCLCQ11098-58671098-58671016409USPSNo. 4,
> Rucker Plaza, Enterprise, AL 36331P.O. Box 1536, Enterprise, AL
> 36331mscnsdpSoutheast sun (Enterprise, Ala.)The southeast
> sun(Enterprise, Ala.)The Southeast sun.Enterprise, Ala. :QST
> Publicationsvolumes :illustrations ;58
> cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in
> 1982.Description based on: Vol. 1, no. 25 (Oct. 21, 1982).Latest issue
> consulted: Vol. 16, no. 43 (Mar. 4, 1998).United
> StatesAlabamaCoffeeEnterprise.AU using 000025827687info:srw/schema/1/marcxmlxml00000cas
> 
> a22000007a 450010487314840305c19819999aluwr ne 0 a0eng dsn
> 85044906
> AAAengAAACPNNSDNSTCPNOCLOCLCQOCLCFOCLCOOCLCAOCLCQ900410885-16620885-16621749310USPSThe
> 
> New Times, 1618 1/2 St. Stephens Rd., Mobile, AL 36603mscnsdpn-us-alNew
> times (Mobile, Ala.)The New times(Mobile, Ala.)The new times.Mobile,
> Ala. :New Times Groupvolumes
> :illustrationsWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan
> 
> in 1981.Vol. 3, no. 49 (Dec. 15-21, 1983) and vol. 3, no. 50 (Dec.
> 22-28, 1983) are both called vol. 3, no. 49 (Dec. 15-21,
> 1983).Description based on: Vol. 2, no. 3 (Jan. 28-Feb. 3, 1982).African
> AmericansAlabamaNewspapers.African
> Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694Newspapers.fast(OCoLC)fst01423814United
> 
> StatesAlabamaMobileMobile.AAPUnknownAug. 15,
> 1985AU using 000024686659info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450018922463881219d19811983alucr ne 0 0eng dsn 88050233
> AARengAARCPNNSDOCLCQmscThe Sylacauga daily advance.Advance/Sylacauga
> dailySylacauga advanceSunday advanceAdvanceSylacauga, Ala. :Mrs. W.A.
> Moody,1981-1893.v.Semiweekly,<Nov. 24, 1982-Feb. 13, 1983>Daily (except
> Mon., Tues. & Sat.),<May 26, 1982-Nov. 21, 1982>Daily (except Sat. &
> Mon.),<Jan. 1, 1981-May 23, 1982>74th Year, no. 123 (Jan. 1, 1981)-76th
> year, no. 83 (Feb. 13, 1983).Days of publication vary.Published as: The
> Advance/Sylacauga daily, <Aug. 28, 1981-May 23, 1982>.Published as:
> Sylacauga advance, <Nov. 24, 1982-Feb. 13, 1983>.On Sunday, published
> as: Sunday advance.United StatesAlabamaTalladegaSylacauga.Childersburg
> star(DLC)sn 88050232Coosa press(DLC)sn 86050293Daily
> home1059-6461(DLC)sn 88050234info:srw/schema/1/marcxmlxml00000cas
> a22000007a 450021026715cr un|||||||||900209c19809999aluwr ne 0
> 0eng dsn 90099002
> AARengAARCPNCUSOCLOCLCQTJCOCLCQOCLCFOCLCOOCLCA926143844AU using 000020585756mscn-us-alSpeakin'
> 
> out news.Speaking out newsDecatur, Ala. :Minority Network,
> Inc.v.WeeklyBegan in 1980.Published in Huntsville, Ala., <1987>-Also
> issued by subscription via the World Wide Web.Description based on: Vol.
> 7, no. 8 (Jan. 7-13, 1987).African AmericansAlabamaNewspapers.African
> American
> newspapersAlabama.AlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African
> 
> American newspapers.fast(OCoLC)fst00799278African
> Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United
> StatesAlabamaMorganDecatur.United
> StatesAlabamaMadisonHuntsville.Speakin' out weekly news(DLC)sn
> 88050097http://www.softlineweb.com/softlineweb/ethnic.htminfo:srw/schema/1/marcxmlxml00000cas
> 
> a22000007a 450014996511861219c19809999aluwr ne 0 a0eng csn
> 86050472
> AARengAARCPNNSDOCLCQ11080-15021080-15021328110USPSnsdppccWest-Alabama
> gazetteWest-Alabama gazette.GazetteMillport, Ala. :Millport Pub.
> Co.,1980-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrier4th
> 
> year, no. 32 (Jan. 3, 1980)-United StatesAlabamaLamarMillport.Gazette
> (Millport, Ala.)(DLC)sn 86050471info:srw/schema/1/marcxmlxml00000cas
> a2200000 a 450011828156850320c19809999aluwr ne 0 0eng dsn
> 86050314 AAAengAAACPNOCLOCLCQmscThe Hartford news-herald.Hartford, Ala.
> :Geneva Publications,1980-volumes :illustrations ;57-59
> cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 80,
> no. 20 (Feb. 14, 1980)-United StatesAlabamaGenevaHartford.News-herald
> (Hartford, Ala.)(DLC)sn 86050313info:srw/schema/1/marcxmlxml00000cas
> a22000007a 450017857788880427d198u198ualusr ne 0 0eng dsn
> 88050097 AARengAARCPNOCLOCLCQOCLCFOCLCOOCLCAmscn-us-alSpeakin' out
> weekly news.Decatur, Ala. :Smothers PublicationsPublished every first
> and third Wed. of each monthDescription based on: Vol. 3, no. 13 (May
> 4-17, 1983).African
> AmericansAlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African
> Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United
> StatesAlabamaMorganDecatur.Weekly news (Huntsville, Ala.)(DLC)sn
> 87050012Speakin' out news(DLC)sn
> 90099002info:srw/schema/1/marcxmlxml00000cas a2200000 a
> 450017807936880418c198u9999aluwr ne 0 a0eng dsn 90099001
> AAAengAAACPNOCLOCLCQThe Daleville Sun-Courier, 310 Daleville Ave.,
> Daleville, AL 36322mscn-us-alDaleville sun-courier.Daleville, Ala. :QST
> Publicationsv. :ill. ;58 cm.WeeklyDescription based on: Vol. 2, no. 28
> (Wed., Feb. 17, 1988).United
> StatesAlabamaDaleDaleville.AU using 000020585749info:srw/schema/1/marcxmlxml00000cas
> 
> a22000007a 450015580838870423c198u9999aluwr ne 0 0eng dsn
> 87050128 AARengAARCPNOCLCQmscGreene County independent.Eutaw, Ala.
> :Greene County Independent, Inc.v.WeeklyDescription based on: Vol. 2,
> no. 10 (Mar. 12, 1987).United
> StatesAlabamaGreeneEutaw.info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450010125135831114d198u198ualucr ne 0 a0eng dsn 83003221
> NSDengNSDOCLCQ0d0746-55210746-55211Auburn Bulletin & Lee County Eagle,
> PO Box 2111, Auburn, Ala. 36830nsdpThe Auburn bulletin & the Lee County
> eagleThe Auburn bulletin & the Lee County eagle.Lee County eagleAuburn
> bulletin and the Lee County eagleAuburn, Ala. :[publisher not
> identified]Semiweekly,<Sept. 5,
> 1984->WeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
> Oct. 19, 1983.United StatesAlabamaLeeAuburn.Auburn bulletin(DLC)sn
> 89050006Eagle (Auburn, Ala.)(OCoLC)18435663Sept. 5,
> 1984info:srw/schema/1/marcxmlxml00000cas a22000007a
> 450018370324880818c198u9999aluwr ne 0 0eng dsn 88050147
> CPNengCPNOCLCQmscTri-city times (Geraldine, Ala.)The Tri-City
> times.Geraldine, Ala. :Wanda Nelsonv.WeeklyDescription based on: Vol. 2,
> no. 24 (Jan. 6, 1982).United
> StatesAlabamaDeKalbGeraldine.info:srw/schema/1/marcxmlxml00000cas
> a22000007a 450010199338831208c198u9999aluwr ne 0 a0eng dsn
> 83005367 NSDengNSDCPNOCLCQ10746-62770746-62771707590USPSSpringville Pub.
> Co., 539 Main St., Springville, AL 35146nsdpThe St. Clair clarionThe St.
> Clair clarion.Saint Clair clarionSpringville, AL :Gary L.
> ShultsWeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
> Vol. 2, no. 1 (Jan. 5, 1982).United StatesAlabamaSt.
> ClairSpringville.AU using 000025783743info:srw/schema/1/marcxmlxml00000cas
> a22000007a 450013787251860627c198u9999aluwr ne 0 a0eng dsn
> 86001923 NSDengNSDCPNOCLCQ10889-00800889-00801The Westerner Star, P.O.
> Box 2060, Bessemer, AL 35021nsdpWestern star (Bessemer, Ala.)The Western
> star(Bessemer, Ala.)The western star.Bessemer, Ala. :Hal
> HodgensWeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
> Vol. 3, no. 15 (Wednesday, June 11, 1986).United
> StatesAlabamaJeffersonBessemer.Bessemer advertiser(DLC)sn
> 87050117AU using 000025805174511.1srw.pc any \"y\" and srw.mt any
> \"newspaper\" and srw.cp exact
> \"Alabama\"50info:srw/schema/1/marcxmlxml1Date,,0mq1lME887FoIbjulKUV6bx9ImwWQNCv9GqZzGS92IKS31lEbcpRJBNHgcE1l29tFaHP9CHe0Yexk1uWQofffull"
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 27 2022, 09:22am via System
> 
> Hello Spencer,
> 
> Thank you for reaching out about the bulk xml files for the US Newspaper
> Directory.
> 
> We don't have documentation specific to these bulk xml files, but upon
> further inspection I can say that each of those files doesn't necessarily
> contain info for 50 newspaper titles. The structure of the titles for
> California and New York, for instance, is different from, say, Alabama's.
> 
> If you look at California for example, the file naming structure
> indicates the year the title started, and then the number of titles
> included in that xml file. So for instance, the files below include info
> for newspapers that started in 2000, 2001, and 2002 respectively. And
> there is info for 30 titles in the xml file from 2000, and 14 in the
> file for 2001, and so on.
> 
>     * ndnp_California_2000_e_0001_0030.xml
>     * ndnp_California_2001_e_0001_0014.xml
>     * ndnp_California_2002_e_0001_0012.xml
> 
> If there are more than 50 titles for a given year, say for California
> starting in 1880, then the next 50 titles will roll into the next xml
> file, and so on. And the last xml file for that year may not include 50
> titles.
> 
> Many of the states seem to group all the years together, so each xml
> file contains 50 titles, until possibly the last one for a given state,
> which may contain fewer.
> 
> I hope this information helps explain the total number of records and
> structure a bit better. Let me know if you have any further questions.
> 
> Best wishes,
> 
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 25 2022, 02:22pm via Email
> 
> Hi, Kerry:
> 
> 
> Might there be documentation on the XML files you mentioned?
> 
> 
> I've successfully read
> 'https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/',
> extracted the names of 6666 XML files, and read the first one,
> "ndnp_Alabama_all-yrs_e_0001_0050.xml". It contains 29415 characters,
> beginning, "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i
> 45001030438981180404c20159999aluwr n 0 a0eng ". With a bit
> more effort, I will likely be able to parse all 6666 of these. The
> names suggest that each contains information on 50 newspapers, totaling
> 333,300. The main page
> "https://chroniclingamerica.loc.gov/search/titles/" says there are only
> 157,521 "Titles currently listed". This suggests that these XML files
> include placeholders for a little more than double the number of
> entries currently in "https://chroniclingamerica.loc.gov/search/titles/".
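> 
> [Sketch of the arithmetic behind that total: the trailing numbers in
> each file name look like a title range, so the expected count per file
> can be read off the name. The regular expression is only my guess at
> the naming convention:]
> 
> fn <- "ndnp_Alabama_all-yrs_e_0001_0050.xml"
> m  <- regmatches(fn, regexec(
>   "^ndnp_(.+)_([0-9]{4}|all-yrs)_e_([0-9]{4})_([0-9]{4})\\.xml$", fn))[[1]]
> # m[2] = state, m[3] = year or "all-yrs", m[4] and m[5] = the title range
> as.integer(m[5]) - as.integer(m[4]) + 1   # expected number of titles: 50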
> 
> 
> Thanks for this.
> 
> 
> Progress.
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 07 2022, 08:55am via System
> 
> Hi Spencer,
> 
> I thought of one more option after I emailed you yesterday that I wanted
> to make you aware of.
> 
> I had explained the other day how we pull the records from OCLC into our
> U.S. Newspaper Directory. You can also access all of the raw MARC
> records found in the directory in xml format from here if you choose:
> https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/
> These will
> provide you all of the data from the record fields in MARC format, so
> you'd get all the data you see here for example:
> https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/
> but in xml. I
> don't know if this might be more data and info than you want to work
> with, but wanted to make sure you were aware of this option as well.
> 
> Best wishes,
> 
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 06 2022, 10:55am via System
> 
> Hi Spencer,
> 
> Thanks for reaching out again. I have been looking at the json view a
> bit closer this morning and your example of "9999."
> 
> After talking with a colleague this morning and looking at various
> examples, I see there is some variation in how the titles with either an
> unknown starting/ending date or currently published titles are being
> handled - depending on the view.
> 
> As an example, I completed a search in the directory for Alaska and the
> city of Anchorage. There are 80 results, and on the first page of
> results you'll see # 4. Fort Richardson news, which was published from
> 1952-19??. The csv view of this state/city search result will show the
> ending date of 19??. But if I append &format=json to this search result,
> this specific title will show an ending date of 1999. After talking with
> a colleague this morning, I discovered an integer had to be used in
> these cases where dates were "?" so that the search based on year range
> would work. Similarly, if you look at # 12 Alaska digest, which was
> published 1994-current, the "current" becomes "9999" in the json view.
> So, the records you are seeing with "9999" would most likely be titles
> with an ending date of "current."
> 
> However, there is an issue with the unknown dates, like "1999" being
> used for "19??" in the example above. The "9" does not get inserted in
> place of "?" when you are looking at the title/LCCN view of a specific
> newspaper. So for instance, if you view the #4 title: Fort Richardson
> news at this url: https://chroniclingamerica.loc.gov/lccn/sn98059792/
> but append .json
> to the end of the url, after the LCCN, like this:
> https://chroniclingamerica.loc.gov/lccn/sn98059792.json
> you'll see
> that the end_year is "19??." Viewing the title/LCCN json view for titles
> that are currently published will also show the end_year as "current."
> The Alaska digest example from above can be viewed here:
> https://chroniclingamerica.loc.gov/lccn/sn97060056.json
> 
> I wasn't aware of the difference between the directory search json view
> and the title/LCCN view. But I think it would be possible to grab
> the data from the title/LCCN json url through an additional script.
> The json url is included in the view under the "url" field.
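> 
> [For instance, the per-title JSON mentioned above could be read in R
> with something along these lines; jsonlite is just one choice of JSON
> reader, not anything required by the site:]
> 
> u <- "https://chroniclingamerica.loc.gov/lccn/sn98059792.json"
> rec <- jsonlite::fromJSON(u)
> rec$end_year   # "19??" in this title/LCCN view, vs. 1999 in the search view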
> 
> Of course, there are unknowns with publishing dates, but better to know
> where the question marks are, and what titles are considered to be current.
> 
> I hope this clarifies the data a bit more - let me know if any of it
> needs more clarification though. And let me know if you have follow-up
> questions.
> 
> Thank you,
> 
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 05 2022, 04:42pm via Email
> 
> Hi, Kerry:
> 
> 
> What would you suggest I do to get counts of the numbers of
> newspapers and publishers operating by year from, say, 1790 to 2021?
> 
> 
> I just determined that 20630 (13 percent) of the 157520 records in
> the US Newspaper database I downloaded a week ago have end_year = 9999.
> I don't think it's feasible to assume that all or even most of those
> are still publishing.
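> 
> [The tabulation I have in mind would be something like the sketch below,
> once start_year and end_year are cleaned to plain integers; "papers" is
> a hypothetical data frame with one row per title:]
> 
> yrs <- 1790:2021
> active <- sapply(yrs, function(y)
>   sum(papers$start_year <= y & papers$end_year >= y, na.rm = TRUE))
> counts <- data.frame(year = yrs, newspapers = active)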
> 
> 
> Might there be some other database that might have this kind of
> information?
> 
> 
> I ask, because Robert McChesney (2004) The Problem of the Media
> (Monthly Review Pr., esp. pp. 34-35) suggests that in the first half of
> the nineteenth century, the US had more newspapers and newspaper
> publishers per capita than any other place or time. He suggests that
> that diversity of newspapers helped encourage literacy and limit
> political corruption, both of which helped propel the young US to its
> current dominance of the international political economy. I'm hoping to
> get some data to evaluate this claim. Sadly, it looks like there is too
> much missing and questionable data in this dataset for me to use this
> without a fairly substantive data cleaning effort.
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 05 2022, 09:05am via System
> 
> Hello Spencer,
> 
> Thank you for reaching out about your additional questions.
> 
> I was looking at the records you mention above, and yes, you are correct
> - those 9 records with the date inconsistencies and the one record for
> The New Mexican mining news
> <https://chroniclingamerica.loc.gov/lccn/sn93061507/> containing "Santa
> Fe.\" have typos in them. Thanks for spotting these - it may be possible
> to have the cataloger in our division correct those typos. I will look
> into this further.
> 
> The U.S. Newspaper Directory doesn't have a connection with Wikimedia or
> Wikipedia. The Library of Congress periodically pulls the records for
> the Directory from OCLC Worldcat
> <https://www.oclc.org/en/worldcat.html>. And those newspaper records in
> OCLC Worldcat have been created by catalogers at various institutions
> around the U.S. over the span of several years. So, occasionally, you
> will find a typo in the records. Corrections can be made by OCLC and
> library staff at the various institutions. Every time we complete a new
> pull on the OCLC records, any corrected records will then populate our
> Directory.
> 
> Regarding your question on the New-York weekly journal - yes, that is
> also correct that it has two records. There is actually a record for
> each format of the newspaper, so this record is for the microfilm format
> <https://chroniclingamerica.loc.gov/lccn/2009252748/> and this one is
> for the original print format
> <https://chroniclingamerica.loc.gov/lccn/sn83030211/>. You can see in
> the heading for the microfilm record where it says [microfilm reel] and
> the print version shows [volume]. You are likely to see this for other
> titles as well because each format has been cataloged with its own LCCN.
> You are also likely to see additional records with [online resource]
> identified as the format as more and more titles are available as
> ePrints or online.
> 
> I hope this helps answer your additional questions a bit more. Please
> reach out if you have any other questions.
> 
> Thank you,
> 
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 04 2022, 01:47pm via Email
> 
> Hi, Kelly:
> 
> 
> At the risk of bombing your inbox with more emails than you want,
> what is your relationship with Wikipedia and other Wikimedia Foundation
> projects like Wikidata?
> 
> 
> I ask, because I've logged over 20,000 edits in Wikimedia Foundation
> projects since 2010, and I would happily try to answer questions about
> Wikidata and other Wikimedia Foundation projects. I have NOT organized
> an edit-a-thon, but I've made presentations at conferences with people
> who have, and I would happily try to help organize such if you could
> find a group of people who want to work to improve this US Newspaper
> database. I think it would be good to establish links between this US
> Newspaper database and Wikidata, with appropriate procedures so changes
> to one could be evaluated for acceptance into the other.
> 
> 
> FYI, John Peter Zenger's famous "New-York weekly journal" (1733-1751)
> appears TWICE in your database with lccn = 2009252748 and sn83030211 and
> ONCE in Wikidata WITHOUT an lccn, even though many other Wikidata items
> have an lccn. See:
> 
> 
> https://www.wikidata.org/wiki/Q23091960
> 
> 
> There's a "WikiProject Newspapers" on Wikipedia and a companion
> "WikiProject Periodicals" on Wikidata:
> 
> 
> https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Newspapers/Wikidata
> 
> 
> https://www.wikidata.org/wiki/Wikidata:WikiProject_Periodicals
> 
> 
> I've tried to connect with others on those projects, so far with only
> limited success. However, you may know that almost anyone can change
> almost anything on Wikipedia and other Wikimedia Foundation projects.
> What stays tends to be written from a neutral point of view citing
> credible sources. They have problems with vandals, but the problems are
> usually easily controlled. This makes Wikipedia and Wikidata very
> useful platforms for cleaning up databases like your US Newspaper dataset.
> 
> 
> Spencer Graves
> 
> 
> ##########
> 
> 
> Hello, Kelly:
> 
> 
> In addition to the invalid JSON, discussed below [NOTE: The "below"
> contains a slight addition to the report I sent last Friday.], I
> found 9 (NINE!) cases where start_year was AFTER end_year. These have
> lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926"
> "sn99065409" "sn89065002" "sn98069857" "sn91059179"
> 
> 
> See:
> 
> 
> https://chroniclingamerica.loc.gov/lccn/sn86071531/
> https://chroniclingamerica.loc.gov/lccn/sn95069213/
> https://chroniclingamerica.loc.gov/lccn/sn90059096/
> https://chroniclingamerica.loc.gov/lccn/sn86058451/
> https://chroniclingamerica.loc.gov/lccn/sn90060926/
> https://chroniclingamerica.loc.gov/lccn/sn99065409/
> https://chroniclingamerica.loc.gov/lccn/sn89065002/
> https://chroniclingamerica.loc.gov/lccn/sn98069857/
> https://chroniclingamerica.loc.gov/lccn/sn91059179/
> 
> 
> These all have obvious coding errors that can be easily fixed. The
> data may not be completely accurate after the fix, but at least they are
> not obviously wrong ;-)
> 
> 
> ##################
> 
> I got invalid JSON from:
> 
> 
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
> 
> 
> After some experimentation, I was able to replicate the problem with
> a request for rows=10:
> 
> 
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
> 
> 
> Duncan Temple Lang <dtemplelang using ucdavis.edu>, Professor of Statistics
> and Associate Dean for Graduate Programs at the University of California
> - Davis, confirmed that it was a JSON error using:
> 
> 
> https://codebeautify.org/jsonvalidator
> 
> 
> He is part of the core team developing the free, open-source R
> programming language. He said that, starting at offsets 161070 and
> 161502 in the character string you get from [the R code RCurl::getURL()],
> we have:
> 
> 
> Santa Fe.\"
> 
> 
> and these are in an entry such as
> 
> 
> "city": ["Santa Fe.\"]
> 
> 
> So the final " is escaped and therefore there is no closing " for the
> string. The parser continues to consume characters looking for the end
> of that string.
> 
> 
> If one "repairs" the text from getURL() with
> 
> 
> ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
> 
> 
> then the rest of my code worked fine.
> 
> 
> You may wish to do something to implement other checks for valid JSON
> and repair this problem. I've scanned all the 157520 records that were
> in that database a couple of days ago, and this is the only JSON error
> identified by the code I used.
> 
> 
> NOTE: I was NOT able to replicate this error when downloading records
> one at a time. That suggests a problem NOT in the database itself but
> in the download algorithm. ???
> 
> 
> Thank you for your help. I will almost certainly have other
> questions ;-)
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 03 2022, 10:39pm via Email
> 
> Hello, Kelly:
> 
> 
> In addition to the invalid JSON, discussed below [NOTE: The "below"
> contains a slight addition to the report I sent last Friday.], I
> found 9 (NINE!) cases where start_year was AFTER end_year. These have
> lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926"
> "sn99065409" "sn89065002" "sn98069857" "sn91059179"
> 
> 
> See:
> 
> 
> https://chroniclingamerica.loc.gov/lccn/sn86071531/
> https://chroniclingamerica.loc.gov/lccn/sn95069213/
> https://chroniclingamerica.loc.gov/lccn/sn90059096/
> https://chroniclingamerica.loc.gov/lccn/sn86058451/
> https://chroniclingamerica.loc.gov/lccn/sn90060926/
> https://chroniclingamerica.loc.gov/lccn/sn99065409/
> https://chroniclingamerica.loc.gov/lccn/sn89065002/
> https://chroniclingamerica.loc.gov/lccn/sn98069857/
> https://chroniclingamerica.loc.gov/lccn/sn91059179/
> 
> 
> These all have obvious coding errors that can be easily fixed. The
> data may not be completely accurate after the fix, but at least they are
> not obviously wrong ;-)
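> 
> A check of roughly the following form flags such records, assuming the
> titles have already been collected into a data frame; "titles", with
> columns lccn, start_year, and end_year, is hypothetical here:
> 
> sy <- suppressWarnings(as.numeric(titles$start_year))  # non-numeric years become NA
> ey <- suppressWarnings(as.numeric(titles$end_year))
> titles$lccn[!is.na(sy) & !is.na(ey) & sy > ey]  # lccn values with start_year > end_year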
> 
> 
> ##################
> 
> I got invalid JSON from:
> 
> 
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
> 
> 
> After some experimentation, I was able to replicate the problem with
> a request for rows=10:
> 
> 
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
> 
> 
> Duncan Temple Lang <dtemplelang using ucdavis.edu>, Professor of Statistics
> and Associate Dean for Graduate Programs at the University of California
> - Davis, confirmed that it was a JSON error using:
> 
> 
> https://codebeautify.org/jsonvalidator
> 
> 
> He is part of the core team developing R, the free, open-source
> programming language. He said that, starting at offsets 161070 and
> 161502 in the character string you get from [the R function
> RCurl::getURL()], we have:
> 
> 
> Santa Fe.\"
> 
> 
> and these are in an entry such as
> 
> 
> "city": ["Santa Fe.\"]
> 
> 
> So the final " is escaped and therefore there is no closing " for the
> string. The parser continues to consume characters looking for the end
> of that string.
> 
> 
> If one "repairs" the text from getURL() with
> 
> 
> ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
> 
> 
> then the rest of my code worked fine.
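> 
> A minimal sketch of that repair and re-parse (untested as written here;
> it assumes the RCurl and jsonlite packages, and jsonlite is only one of
> several JSON parsers one could use) would be:
> 
> u <- "https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json"
> txt <- RCurl::getURL(u)
> jsonlite::validate(txt)   # presumably FALSE, because of the stray \" described above
> ftxt <- gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)  # drop the spurious escaping backslash
> jsonlite::validate(ftxt)  # should now be TRUE
> page <- jsonlite::fromJSON(ftxt)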
> 
> 
> You may wish to implement additional checks for valid JSON and to
> repair this problem. I've scanned all the 157520 records that were
> in that database a couple of days ago, and this is the only JSON error
> identified by the code I used.
> 
> 
> NOTE: I was NOT able to replicate this error when downloading records
> one at a time. That suggests a problem NOT in the database itself but
> in the download algorithm. ???
> 
> 
> Thank you for your help. I will almost certainly have other
> questions ;-)
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jul 01 2022, 11:46am via Email
> 
> Hello, Kelly:
> 
> 
> I got invalid JSON from:
> 
> 
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
> 
> 
> After some experimentation, I was able to replicate the problem with
> a request for rows=10:
> 
> 
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
> 
> 
> Duncan Temple Lang <dtemplelang using ucdavis.edu>, Professor of Statistics
> and Associate Dean for Graduate Programs at the University of California
> - Davis, confirmed that it was a JSON error using:
> 
> 
> https://codebeautify.org/jsonvalidator
> 
> 
> He is part of the core team developing R, the free, open-source
> programming language. He said that, starting at offsets 161070 and
> 161502 in the character string you get from [the R function
> RCurl::getURL()], we have:
> 
> 
> Santa Fe.\"
> 
> 
> and these are in an entry such as
> 
> 
> "city": ["Santa Fe.\"]
> 
> 
> So the final " is escaped and therefore there is no closing " for the
> string. The parser continues to consume characters looking for the end
> of that string.
> 
> 
> If one "repairs" the text from getURL() with
> 
> 
> ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
> 
> 
> then the rest of my code worked fine.
> 
> 
> You may wish to implement additional checks for valid JSON and to
> repair this problem. I've scanned all the 157520 records that were
> in that database a couple of days ago, and this is the only JSON error
> identified by the code I used.
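> 
> A stripped-down sketch of such a scan (not the exact code I ran, and
> untested as written) is simply a loop over the result pages that flags
> any page whose text fails to validate as JSON:
> 
> base <- "https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&format=json&page="
> bad <- integer(0)
> for (pg in 1:316) {   # ceiling(157520 / 500) = 316 pages
>   txt <- RCurl::getURL(paste0(base, pg))
>   if (!jsonlite::validate(txt)) bad <- c(bad, pg)
>   Sys.sleep(5)        # pause between requests to be gentle on the server
> }
> bad  # per the above, only the page with the "Santa Fe." entry should be flagged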
> 
> 
> Thank you for your help. I will almost certainly have other
> questions ;-)
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jun 28 2022, 02:20pm via System
> 
> Hello Spencer,
> 
> Thank you for sending along your follow-up questions.
> 
> I'm glad to hear the json view will work for you. It was recommended to
> me that you limit your requests to 500 rows at a time. And a developer
> here at LC suggests the following regarding rate limiting:
> 
> “To avoid being blocked by the server, the current rate-limiting rules
> restrict un-cached requests to URLs starting with
> https://chroniclingamerica.loc.gov/search/
> <https://chroniclingamerica.loc.gov/search/> to 120 requests every 10
> minutes from a single IP address.”
> 
> So, I think if you limited each of your requests to 500 rows at a time
> with the proper pauses, then you should be able to access what you need.
> 
> As for the csv view, I checked on this as well, and was informed that
> the csv view was not implemented for all url formats. The csv view was
> only implemented for this view:
> https://chroniclingamerica.loc.gov/newspapers/
> <https://chroniclingamerica.loc.gov/newspapers/> and urls resulting from
> US Directory search results - e.g., if you wanted to narrow down your
> search results by state, city, date range, etc., found at this link:
> https://chroniclingamerica.loc.gov/search/titles/
> <https://chroniclingamerica.loc.gov/search/titles/>. So, if you wanted a
> csv and limited your search by state ( for example:
> https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
> <https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv>
> ), you could append &format=csv to the search result url and get the csv
> to automatically download. But, if your search results ended up being
> over a couple thousand titles, then the system would probably time out.
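> 
> In R, for example, a limited search like that Alaska example can
> usually be read straight into a data frame with something like the
> following untested sketch:
> 
> akURL <- paste0("https://chroniclingamerica.loc.gov/search/titles/results/",
>                 "?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=",
>                 "&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv")
> alaska <- read.csv(akURL)  # read.csv() accepts a URL and downloads the csv view directly
> head(alaska)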
> 
> I hope this info helps! Let me know if you have any other questions.
> 
> Best wishes,
> 
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jun 27 2022, 04:15pm via Email
> 
> Hello, Kerry:
> 
> 
> Thanks for the reply. Can you please give me some further guidance
> on two things "so that the system is not overwhelmed"?
> 
> 
> 1. The max size in a small batch?
> 
> 
> 2. Any limit on the number of small batches in a second or minute?
> 
> 
> I've found that I can download small batches under program control
> with "RCurl::getURL" in R (programming language) using, e.g.:
> 
> 
> https://chroniclingamerica.loc.gov/search/titles/results/?rows=20&page=2&format=json
> 
> 
> With this, I can control the batch size with "rows=20" vs. "rows=50"
> vs., e.g., "rows=1000". A naive search says there are 157520 "results".
> With "rows=1000", this would require 158 calls. With "rows=20", it
> would require 7876 calls. Before I start, I need to decide which fields
> I want; I don't need them all.
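> 
> For example, with "rows=1000" the whole download might look roughly
> like the untested sketch below; the "items" element and the field names
> inside it are my guesses from the json view and would need to be
> checked against an actual batch:
> 
> base  <- "https://chroniclingamerica.loc.gov/search/titles/results/?format=json&rows=1000&page="
> pages <- vector("list", 158)   # ceiling(157520 / 1000) = 158 calls
> for (pg in seq_along(pages)) {
>   page <- try(jsonlite::fromJSON(paste0(base, pg)), silent = TRUE)  # try() so one bad page
>   if (!inherits(page, "try-error"))                                 # does not kill the loop
>     pages[[pg]] <- page$items[, c("lccn", "title", "start_year", "end_year")]
>   Sys.sleep(5)                 # pause between calls; adjust once the rate limits are known
> }
> titles <- do.call(rbind, pages)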
> 
> 
> Thanks,
> Spencer Graves
> 
> 
> p.s. I tried appending "&format=csv" and got "Error 504 Ray ID:
> 7220896da85e86e7 • 2022-06-27 19:19:53 UTC Gateway time-out". I used:
> 
> 
> https://chroniclingamerica.loc.gov/search/titles/results/?state=&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
> 
> 
> I can get what I want using json so do not need csv. However, I
> thought you might want to know that I was unable to get csv to work.
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jun 27 2022, 10:54am via System
> 
> Hello Spencer,
> 
> Thank you for contacting the Library of Congress about searching the US
> Newspaper Directory. I wanted to follow up with you regarding your
> request to output the data in a machine readable format.
> 
> It looks like you were provided the link to the API documentation for
> the website: About the Site and API
> <https://chroniclingamerica.loc.gov/about/api/>. Scroll down to the
> section with the heading, Searching the directory and newspaper pages
> using OpenSearch. This section describes the search functionality and
> structure for the US Newspaper Directory in more detail. It is possible
> to return your directory searches in json format by appending
> &format=json to the end of the url. It is also possible to return search
> results in csv format by appending &format=csv to the end of the url,
> but I would strongly suggest that you do this in small batches by
> putting limits on your search so that the system is not overwhelmed.
> 
> So, from the search page for the US Newspaper Directory
> <https://chroniclingamerica.loc.gov/search/titles/> you could
> potentially limit your search based on state and city, or date range,
> and/or even frequency. Then once you've completed the search, you can
> add &format=csv to the end of the url to automatically download a csv of
> those records. The resulting csv will contain several fields/headers:
> lccn, title, place of publication, start year, end year, publisher,
> edition, frequency, subject, state, city, country, language, oclc
> number, and holding type. I think these fields include the information
> you were looking for. But, again, I would like to stress that you put
> limits on your search before creating the csv so as not to overwhelm the
> system.
> 
> Please let me know if you have any other additional questions.
> 
> Best wishes,
> 
> Kerry Huller
> Newspaper & Current Periodical Reading Room
> Serial & Government Publications Division
> Library of Congress
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jun 23 2022, 01:55pm via System
> 
> Mr. Graves,
> 
> I'm going to transfer your request to a member of our digital collections
> team who may be of more assistance to you than me.
> 
> Mike
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jun 23 2022, 01:51pm via Email
> 
> Dear Mr. Queen:
> 
> 
> Thanks for the reply. I'm still confused. I downloaded and
> installed Docker Desktop and "docker-compose.yml" and ran their "Getting
> Started" Tutorial, but I don't see what to do next.
> 
> 
> I repeat: I'd like to analyze "U.S. Newspaper Directory,
> 1690-Present" (https://chroniclingamerica.loc.gov/search/titles/), which
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jun 22 2022, 07:15pm via System
> 
> Mr. Graves,
> 
> Programmatic access to the data for Chronicling America
> <https://chroniclingamerica.loc.gov/> and possibly the U.S. Newspaper
> Directory <https://chroniclingamerica.loc.gov/search/titles/> can be
> found on the About the Site and API
> <https://chroniclingamerica.loc.gov/about/api/> page in various formats.
> Also, please note that Chronicling America contains newspapers published
> from 1777-1963, but does not include every U.S. newspaper published in
> that time period.
> 
> Please let me know if I can be of further assistance.
> 
> 
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jun 22 2022, 06:14pm via Email
> 
> Dear Mr. Queen:
> 
> 
> Can we simplify this to just giving me the data behind "U.S.
> Newspaper Directory, 1690-Present"
> (https://chroniclingamerica.loc.gov/search/titles/) in a machine
> readable format, e.g., csv or xlsx or a MySQL database?
> 
> 
> As I mentioned in my original email, a naive search of that without
> restrictions returned 157520 titles in 7876 pages with up to 20 titles
> per page giving date ranges in at least some cases. I could probably
> write software to scrape those 7876 pages from your web site and combine
> them into a data file.
> 
> 
> I have a PhD in statistics and have been using the R programming
> language and similar software for decades. This includes publishing
> tutorials on how to analyze data like this on Wikiversity.[1] I'd like
> to do something similar with this. I could help make your data more
> useful to others and discuss with you how we might prioritize
> improvements like accessing the other sources you mentioned.
> 
> 
> Thanks very much for your reply.
> 
> 
> Sincerely,
> Spencer Graves, PhD
> Founder, EffectiveDefense.org
> 4550 Warwick Blvd 508
> Kansas City, MO 64111
> m: 408-655-4567
> 
> 
> [1] e.g.:
> 
> 
> https://en.wikiversity.org/wiki/US_Gross_Domestic_Product_(GDP)_per_capita
> ------------------------------------------------------------------------
> 
> Newspapers and Current Periodicals Reference Librarian
> 
> Jun 22 2022, 05:27pm via System
> 
> Mr. Graves
> 
> Your request is a little more complex than it first appears and requires
> extensive research. A variety of resources should be consulted to
> determine the circulation statistics of newspapers published prior to
> 1851. You will need to check newspaper union lists and newspaper
> histories. Union listspresent lists of newspapers in geographic
> arrangement according to place of publication, and specify which
> libraries or other institutions hold collections of those newspapers and
> the dates of their holdings. These can also be useful for tracking title
> changes throughout a newspaper's history. Newspaper
> histories like American Journalism: A History: 1690-1960
> <https://lccn.loc.gov/62007157> (Mott), The Penny Press
> <https://lccn.loc.gov/2004043078> (Thompson), and The Press and America
> <https://lccn.loc.gov/99044295> (Emery et al.) may not include
> circulation statistics, but they do document the diversity and progress
> of newspaper publishing, including notable newspapers of the era.
> Newspaper histories also cover the history of the printers and printing
> of newspapers in a state, county, or region more generally, and provide
> more condensed histories of the editors, journalists, and evolution of
> the newspapers in a specific area. Newspaper histories and union lists
> should be available at most large public or university libraries. More
> information about union lists, newspaper histories, and researching
> newspapers in general can be found in the U.S. Newspaper Collections at
> the Library of Congress
> <https://guides.loc.gov/united-states-newspapers/introduction> research
> guide (see Reference Sources).
> 
> Please let me know if I can be of further assistance.
> 
> ------------------------------------------------------------------------
> 
> Original Question
> 
> Jun 20 2022, 02:34pm via System
> 
> How can I get counts of the numbers of newspapers by year in the US, and
> preferably also elsewhere?
> 
> A search of "U.S. Newspaper Directory, 1690-Present"
> (https://chroniclingamerica.loc.gov/search/titles/) returned 157520
> titles in 7876 pages with up to 20 titles per page giving date ranges to
> the extent that it's known. If I can get a data file (e.g., csv or xls),
> I can summarize. I could also use data on circulation and frequency and
> especially parent company for multiple newspapers published by the same
> company, to the extent that such is available.
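> 
> To make the kind of summary concrete: given such a file with numeric
> start and end years (here a hypothetical data frame "titles" read from
> that csv), counts of titles active in each year could be computed with
> something like:
> 
> yrs <- 1690:2022
> active <- sapply(yrs, function(y)
>   sum(titles$start_year <= y & titles$end_year >= y, na.rm = TRUE))
> names(active) <- yrs
> head(active)   # number of titles active in 1690, 1691, ...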
> 
> I'm interested in this, because McChesney quoted Tocqueville in
> suggesting that the US had more newspapers per person (or per million
> population) prior to 1851 than at any other time or place in history.
> I'd like to evaluate that claim with data to the extent that I can. See
> "https://en.wikiversity.org/wiki/Social_construction_of_crime_and_what_we_can_do_about_it#Newspapers_1790_-_present".
> 
> 
> Thanks, Spencer Graves, PhD
> m: 408-655-4567
> 
> ------------------------------------------------------------------------
> 
> Thank you for using Newspapers & Current Periodicals Ask a Librarian
> Service!
> 
> 
> This email is sent from Ask a Librarian in relationship to ticket #9625195.
> 
> Read our privacy policy. <https://springshare.com/privacy.html>
> 
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list