Bible Studies. org.uk

"Some Thoughts about Archive.org Digital Texts" from "BiblicalStudies.org.uk"

A few months ago I was asked to undertake a project for Tyndale House which involved searching through their catalogue for out-of-copyright books and trying to link these with electronic versions already on-line. This naturally brought me to archive.org to search through its massive collection of on-line texts. On the basis of that experience I thought it might be helpful to write down a few thoughts on the subject that may spark a discussion.

First of all, here are some of the many positive features of archive.org texts:


  1. There is a huge amount of material available. Well over 90% of the 450 titles I searched for were already there.
  2. This material can be downloaded in a wide range of formats, including PDF, DJVU, TEXT, HTML and Kindle-compatible files.
  3. The site is supported by an enthusiastic user-base who are constantly adding new material.
Now, some issues that need to be considered.
  1. Some books that are still under copyright in the UK because they were printed there are listed as being in the Public Domain on archive.org because it is hosted in the United States. In order to prevent them being downloaded outside the US, Google Books (linked from archive.org) has blocked non-US IP addresses from accessing them - which of course can always be circumvented using a US-based proxy.
  2. Some material that is in the Public Domain in the UK is being blocked by Google Books.
  3. The first two points serve as a reminder that users cannot rely on the accuracy of the copyright declaration on the site outside of the US - you need to double check everything.
  4. Some scans are incomplete and/or of poor quality.
  5. Scans to PDF are often very large files. In one trial I conducted, reprocessing the files reduced the file size by 50%.
  6. The search facility is fine if you know the exact title of the work you are after. However, if you misspell it or get a word wrong then the book you are after will not appear in the results.
  7. Perhaps as a result of (6), the download counts listed next to certain titles are often surprisingly low.
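To illustrate point 6, here is a rough sketch of how a search could tolerate a misspelt title. It is only an illustration using Python's standard difflib module; the list of titles and the function name suggest_titles are made up for the example, and I am not suggesting that archive.org's search works this way or could simply be swapped for this.

```python
import difflib

# Hypothetical sample titles for illustration; not taken from archive.org.
TITLES = [
    "The Epistle to the Romans",
    "Commentary on the Gospel of John",
    "Lectures on the History of the Jewish Church",
]


def suggest_titles(query, titles, cutoff=0.6, limit=3):
    """Return titles that approximately match a possibly misspelt query."""
    # Compare in lower case, but remember the original spelling of each title.
    lowered = {title.lower(): title for title in titles}
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[match] for match in matches]


# "Epistel" is misspelt, so an exact-title search would miss the book.
print(suggest_titles("The Epistel to the Romans", TITLES))
```

The cutoff value (0.6 here, which is also difflib's default) controls how forgiving the match is; a lower value returns more, but less relevant, suggestions.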
Please "weigh" rather than just "count" the points above, as the benefits of the site far outweigh the negative issues. For me they indicate a number of opportunities to take this work further:
  • Important UK-published theological books in the Public Domain could be re-scanned and hosted so as to avoid the unnecessary blocks on accessing them.
  • Poor quality scans can be replaced.
  • When serving users on dial-up or slow Internet connections there is scope for reprocessing selected works and hosting them elsewhere to reduce the file sizes.
  • The site lends itself to being linked with specialist bibliographies (such as those provided by the TheologyOnTheWeb sites) that link directly to material hosted on archive.org. This gets round the problem of searches when the material is not being blocked.
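As a sketch of how a bibliography entry might be turned into archive.org links, the snippet below builds a query URL for archive.org's advanced search page and a link to an item's details page. The function names and the sample identifier are hypothetical, and the field names used (title, creator, mediatype) reflect my understanding of the search fields, so please check them against the site's own documentation before relying on them. Nothing is fetched here; the code only builds the URLs.

```python
from urllib.parse import quote, urlencode

SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def search_url(title, author=None, rows=5):
    """Build an advanced-search URL for a bibliography entry (no request is made)."""
    # Drop double quotes so they cannot break the quoted phrase in the query.
    parts = ['title:"{}"'.format(title.replace('"', "")), "mediatype:texts"]
    if author:
        parts.append('creator:"{}"'.format(author.replace('"', "")))
    params = [
        ("q", " AND ".join(parts)),
        ("fl[]", "identifier"),  # fields to return for each match
        ("fl[]", "title"),
        ("rows", str(rows)),
        ("output", "json"),
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def details_url(identifier):
    """Build the link to an item's page from its identifier."""
    return "https://archive.org/details/" + quote(identifier)


print(search_url("The Epistle to the Romans", author="Hodge"))
# "example-identifier" is a placeholder, not a real item.
print(details_url("example-identifier"))
```

A bibliography page could store the identifier once a match has been confirmed by hand, so that readers follow a direct link and never need to use the search at all.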
What has been your experience with using archive.org? Can you suggest any other ways in which the wealth of material there can be better used?

Replies

  • First I want to welcome you to the group.  It is amazing how many folks are involved with one of Tyndale's projects, which are growing in number.  I have utilized archive.org for some research and you are right that the benefits outweigh the negatives.  Many of the books I desired to read were written in the 19th century or directly pertained to that century.  Once I have located a certain book, I check archive.org to see if it is listed.  I use Linux, and the program I use to open files automatically reduces the file size when they are downloaded.  I don't have any real issues with "work-arounds" because of the nature of the file systems and the ability of my OS to avoid innate obstacles of Microsoft and Macintosh.  The only time copyright is an issue is when quoting or copying materials.  For my part, there is no restriction on reading any given book.  If the site lists a book as public domain, it is the site's legal concern if they have erred. That does not excuse misuse or other violations, but it does place the responsibility on the provider and not the user.  If I know that a certain book has been wrongly listed, and I have discovered a few, then I strictly abide by copyright provisions.  In some cases I have contacted the publisher to inform them and they have always responded with permission for me to use portions of the book - with only one exception.  Honesty is always the best policy.

    One improvement I would recommend is grouping books under certain subject areas.  I did this for the school library for which I am the librarian.  It takes time and effort, but once structured it goes pretty quickly.  I first established the different subject areas and set up the folders. After that I downloaded the books and uploaded them to the folders under each subject area.  I used a search engine to search the library instead of the internet - an option offered by a few of the search engines.  It then became possible to locate books much faster.  But I'm not sure how archive.org is set up. One may need to gain access to certain protocols or scripts to make such changes possible.
