this post was submitted on 08 Jul 2023

commandline

I need help with wget and I hope to find it here. My goal is to mirror a site (the soon-to-be-replaced website at work), and I tried to do so with wget. I can mirror the pages and filter out the unwanted file types without any trouble.

My problem is this: several pages (e.g. /internal/forms) link to files stored elsewhere on the site, like /files/12345/form1.docx. wget mirrors the server's directory structure, so it doesn't save the file in the folder internal/forms but creates a new folder files/12345 for it.

I understand why this happens, but I really need the file in internal/forms and I can't find a solution. Is there any way to achieve that? Thank you so much for your help!

top 1 comments
[–] [email protected] 2 points 1 year ago

I don't think there's a way to make the links visible to wget. You could maybe try creating symlinks to the files in their incorrect locations, but that involves keeping a bunch of incorrect directories around. Alternatively, you can let wget do its thing and clean up afterwards.
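
If you go the clean-up route, something like the following could be a starting point. It is only a rough sketch and makes several assumptions that may not hold for your mirror: pages are saved as .html files, the links in them are still root-relative (`href="/files/..."`, i.e. not rewritten by wget), and the pages are UTF-8. The function name `relocate_linked_files` is made up for this example. It copies each linked file next to the page that links to it and rewrites the link to the bare file name.

```python
import re
import shutil
from pathlib import Path

# Assumption: links look like href="/files/12345/form1.docx" (double quotes,
# root-relative). Adjust the pattern if your mirror's HTML looks different.
HREF_RE = re.compile(r'href="(/files/[^"#?]+)"')


def relocate_linked_files(mirror_root):
    """Copy files linked under /files/ next to the linking page.

    Returns the list of copied target paths. The originals under files/
    are left in place, so you can delete them once you have checked the result.
    """
    root = Path(mirror_root)
    copied = []
    # Build the page list up front, since we add files while iterating.
    for page in list(root.rglob("*.html")):
        html = page.read_text(encoding="utf-8")

        def fix(match):
            source = root / match.group(1).lstrip("/")
            if not source.is_file():
                return match.group(0)  # file was not mirrored: leave link alone
            target = page.parent / source.name
            # Limitation: two different files with the same name linked from
            # the same folder would collide; the first one wins here.
            if not target.exists():
                shutil.copy2(source, target)
                copied.append(target)
            return f'href="{source.name}"'

        new_html = HREF_RE.sub(fix, html)
        if new_html != html:
            page.write_text(new_html, encoding="utf-8")
    return copied
```

You would call it with the folder wget created, e.g. `relocate_linked_files("mirror")`, ideally on a copy of the mirror first. Note that a page saved as internal/forms.html gets the file in internal/, not internal/forms/, because the file is placed in the same folder as the page itself.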