
Thread: Create browsable copy of a website on my hard drive (Tool)

  1. #1
    Senior Member retroborg's Avatar
    Join Date
    Aug 2003
    Posts
    679
    Thanks
    19
    Thanked
    19 times in 13 posts

    Create browsable copy of a website on my hard drive (Tool)

    Hello,
    I would like to make a browsable copy of a website on my hard drive as HTML/HTM files, including its full subdirectory structure, files (text, graphics, sound), links (including links pointing to other servers) and, most importantly, its search engine, so that when I type something in the “search index” box, it finds the appropriate page in the copy stored on my hard drive.


    What would be the best tool/program for this task?

    Thanks in advance

  2. #2
    Network|Geek kidzer's Avatar
    Join Date
    Jul 2005
    Location
    Aberdeenshire
    Posts
    1,732
    Thanks
    91
    Thanked
    47 times in 42 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    File -> Save As in your browser would save the current page to your HD.

    You're going to struggle to get it all on your HD, though. The site says it has over 25,000 articles (so I would expect at least one page each right there), and the search function probably runs server-side code, so I don't expect you'll be able to get that.

    Edit: Ooh yes, what saracen said as well.
    Last edited by kidzer; 27-08-2007 at 01:25 PM.
    "If you're not on the edge, you're taking up too much room!"
    - me, 2005

  3. #3
    Admin (Ret'd)
    Join Date
    Jul 2003
    Posts
    18,481
    Thanks
    1,016
    Thanked
    3,208 times in 2,281 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    Retroborg, please be careful with this. Depending on what the site is, you're treading close to copyright issues. Even with a wiki site operating under a GFDL, there may be (and probably are) elements of most websites protected by copyright, perhaps including site design but certainly including some third-party elements, like advertising.

    I'm assuming that the site you wish to copy is one to which you either own the copyright, or have permission from the owner to do this. If not, we'd have to close this thread.

  4. #4
    Senior Member burble's Avatar
    Join Date
    May 2007
    Location
    Olney
    Posts
    1,138
    Thanks
    8
    Thanked
    90 times in 89 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    Assuming no copyright issues, grab a copy of wget and then do 'wget -r www.mysite.com' from a command prompt.
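
    A plain 'wget -r' leaves links pointing at the live site and may skip images and stylesheets. As a rough sketch only (the option set, the helper name build_wget_command and the one-second wait are assumptions, not something the thread tested), the command could be assembled like this with options meant for offline browsing:

```python
import shlex


def build_wget_command(start_url, wait_seconds=1):
    """Return a wget argument list for an offline-browsable mirror.

    build_wget_command is a hypothetical helper; the options are standard
    GNU wget flags, but the combination is only a suggestion.
    """
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so the copy works from disk
        "--page-requisites",   # also fetch images, CSS and similar assets
        "--adjust-extension",  # save HTML pages with an .html suffix
        "--no-parent",         # never climb above the starting directory
        f"--wait={wait_seconds}",  # pause between requests to spare the server
        start_url,
    ]


if __name__ == "__main__":
    command = build_wget_command("http://www.mysite.com/")
    # Print the command for review; running it is left to the reader,
    # e.g. with subprocess.run(command, check=True).
    print(shlex.join(command))
```

    This still only fetches what the server sends out as pages; anything that runs on the server, such as a search box, is not copied.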

  5. #5
    Agent of the System ikonia's Avatar
    Join Date
    May 2004
    Location
    South West UK (Bath)
    Posts
    3,736
    Thanks
    39
    Thanked
    75 times in 56 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    That won't capture the search engine or follow external links.
    It is Inevitable.....


  6. #6
    lazy student nvening's Avatar
    Join Date
    Jan 2005
    Location
    London
    Posts
    4,656
    Thanks
    196
    Thanked
    31 times in 30 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    Surely you wouldn't want it to follow external links, because you would end up with half the internet stored on your HDD? lol

    Also, as said before, I doubt very much you would be able to save the search feature, as it probably runs off some sort of database that is not accessible unless you are the site admin.
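
    To illustrate the external-links point: a mirroring tool typically decides, link by link, whether a target is still on the site being copied. A minimal sketch of that check, assuming "same site" simply means "same host name" (the function names are made up for this example):

```python
from urllib.parse import urljoin, urlparse


def is_internal(link, page_url):
    """True if link, resolved against page_url, stays on the same host."""
    target = urlparse(urljoin(page_url, link))
    same_host = target.hostname == urlparse(page_url).hostname
    # Skip mailto:, javascript: and other non-web schemes as well.
    return target.scheme in ("http", "https") and same_host


def internal_links(links, page_url):
    """Resolve links to absolute URLs and keep only the same-host ones."""
    return [urljoin(page_url, link) for link in links
            if is_internal(link, page_url)]
```

    Real tools use more elaborate rules (allowed domains, depth limits), but dropping off-host links is what stops the download from spreading across the web.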
    (\__/)
    (='.'=)
    (")_(")

  7. #7
    Senior Member retroborg's Avatar
    Join Date
    Aug 2003
    Posts
    679
    Thanks
    19
    Thanked
    19 times in 13 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    Well, if I can't get the search engine it will be no good, as that’s what makes that site so cool and manageable.
    Will I be able to get the search engine with any of the following tools?

    HTTrack (Free)
    http://www.httrack.com/

    Offline Explorer Pro
    http://www.metaproducts.com/mp/Offline_Explorer_Pro.htm

    Offline commander
    http://www.zylox.com/

    wget (free)
    http://www.gnu.org/software/wget/#downloading

    Teleport pro / ultra (commercial)
    http://www.tenmax.com/teleport/pro/

    Wysigot Light
    http://www.snapfiles.com/reviews/Wys...t/wysigot.html

    Thanks in advance.

  8. #8
    Comfortably Numb directhex's Avatar
    Join Date
    Jul 2003
    Location
    /dev/urandom
    Posts
    17,074
    Thanks
    228
    Thanked
    1,027 times in 678 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    how do you intend to get a server-side search engine?

    Look at it this way: click "view source" on this page. Do you think there's a file stored on disk on the Hexus servers called "showthread.php?t=117034" with this exact thread in it? No. The page is generated on request, and a site's search results are produced the same way.
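
    A toy illustration of "generated on request", not the forum's real code: the THREADS dictionary stands in for the site's database and render_showthread is a made-up name. The HTML only exists once the function runs for a particular query string.

```python
import html
from urllib.parse import parse_qs

# Hypothetical stand-in for the server's database of threads.
THREADS = {
    "117034": "Create browsable copy of a website on my hard drive (Tool)",
}


def render_showthread(query_string):
    """Build the page for a query string such as 't=117034'."""
    params = parse_qs(query_string)
    thread_id = params.get("t", [""])[0]
    title = THREADS.get(thread_id)
    if title is None:
        return "<html><body><h1>Thread not found</h1></body></html>"
    # The markup is assembled here, per request; no such file sits on disk.
    return f"<html><body><h1>{html.escape(title)}</h1></body></html>"
```

    A downloader only ever sees the returned HTML. The function and the data behind it stay on the server, which is why the search feature cannot be copied.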

  9. #9
    Senior Member charleski's Avatar
    Join Date
    Jul 2006
    Posts
    1,586
    Thanks
    7
    Thanked
    52 times in 45 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    Making the web content available in an unrestricted fashion grants you an implicit licence to download and view said content; copyright only comes into it if you were to redistribute the content.

    If the site is dynamically generated using server-side scripting then nothing will allow you to download an exact copy. The tools mentioned, though, will generate a folder structure that you could search using Google Desktop or a similar search tool.
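
    If no desktop search tool is to hand, a crude replacement for the site's search box can be built over the downloaded folder. This is only a sketch under assumptions: the mirror consists of .htm/.html files, a match means every query word appears somewhere in the page, and all names here are invented for the example.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collect text nodes (this also picks up script and style text)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_words(html_text):
    """Return the set of lowercase words found in a page's text."""
    parser = _TextExtractor()
    parser.feed(html_text)
    parser.close()
    return set(re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower()))


def build_index(mirror_dir):
    """Map each word to the set of mirrored files that contain it."""
    index = defaultdict(set)
    for path in Path(mirror_dir).rglob("*.htm*"):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            for word in page_words(text):
                index[word].add(path)
    return index


def search(index, query):
    """Return the files containing every word of the query, sorted."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

    Typical use would be index = build_index("path/to/mirror") followed by search(index, "some words"). It has no ranking and no phrase matching, so it is a long way from the site's own search engine.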

  10. #10
    Admin (Ret'd)
    Join Date
    Jul 2003
    Posts
    18,481
    Thanks
    1,016
    Thanked
    3,208 times in 2,281 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    Originally Posted by charleski:
    Making the web content available in an unrestricted fashion grants you an implicit licence to download and view said content; copyright only comes into it if you were to redistribute the content.
    Putting material on the web may be viewed as granting an implicit licence, but it will only be an implicit licence to do the things people would expect in the normal course of web usage. I would suggest that that includes browsing a website, together with the implicit caching issues, but does not imply wholesale download, storage or archive. Furthermore, it ONLY grants an implicit licence if the explicit terms of that licence don't preclude it.

    retroborg didn't say what site it was. It may well be that he has permission to download this site. It might even be his own site. Or it may be that the T&Cs explicitly exclude such usage.

  11. #11
    ɯʎɔɐɹsɐʌʍ mycarsavw's Avatar
    Join Date
    Feb 2007
    Posts
    4,945
    Thanks
    1,097
    Thanked
    653 times in 482 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    Backstreet Browser

    Crap name, not so crap program.

    EDIT: Holy thread revival, sorry
    |Kata: "Read title as 'fisting'. Not sure why I clicked. Relieved, really."|
    |TAKTAK: "It was so small that mine wouldn't fit into it"|

  12. #12
    Senior Member retroborg's Avatar
    Join Date
    Aug 2003
    Posts
    679
    Thanks
    19
    Thanked
    19 times in 13 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    Thanks!
    Will I be able to get the search engine as well with that app?

  13. #13
    ɯʎɔɐɹsɐʌʍ mycarsavw's Avatar
    Join Date
    Feb 2007
    Posts
    4,945
    Thanks
    1,097
    Thanked
    653 times in 482 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    I used it a few years ago on a small site, and once downloaded I used Windows Explorer/Search to search for stuff.

  14. #14
    RIP Evy mroz's Avatar
    Join Date
    Jul 2007
    Location
    A wonderful avatar filled place
    Posts
    588
    Thanks
    40
    Thanked
    16 times in 15 posts

    Re: Create browsable copy of a website on my hard drive (Tool)

    Originally Posted by retroborg:
    Thanks!
    Will I be able to get the search engine as well with that app?
    Strictly speaking that depends on how the searching is implemented, but in reality NO - as has already been explained multiple times.

    Such sites utilise dynamic content. This means there are programs running on the server which perform the searches on their internal data & which most likely generate the pages themselves. Even if you ran the software on your own machine to support such programs, there's no way you can download the site's program code itself - it isn't visible to anyone other than the site maintainers.
