
Need windows to compile?

Status
Not open for further replies.

heymrdj

The Debian Lover
NLC
I've been assigned the Danbooru Resurrection project on EA Networx. I now have in my possession the 20GB "image" of the site. One required step is running php -f rip.php 0 1041 to rip all 1041 bases. The rip.php script, however, is written to use php_domxml.dll, which is a Windows PHP extension. I believe Linux has php_domxml.so, enabled through php.ini, which is supposed to be the Linux version of php_domxml.dll. Question is, how do I make my Debian 3.1 server run it properly? My Windows 2000 Professional box running Windows Apache 2.2.4 and Windows PHP 4.2.1 should run it fine in theory. But is there any way to run it on a Linux server?

rip.php
PHP:
<?php

dl('php_domxml.dll');

$server = "http://danbooru.lolitron.org/data";

$start = 0;
$end = 1041;

if ($argc == 3) { $start = $argv[1]; $end = $argv[2]; }
echo "running from $start to $end\n\n\n";

for($i = $start; $i <= $end; $i++) {
  $xmlfile = sprintf("xml/%06d00.xml", $i);
  if(!file_exists(dirname(__FILE__).'/'.$xmlfile.'.done')) {
    echo "$xmlfile:\n--------\n\n";
    $dom = domxml_open_file(dirname(__FILE__).'/'.$xmlfile);
    if($dom) {
      foreach($dom->get_elements_by_tagname('post') as $post) {
        echo $file = $post->get_attribute('file_name');
        $bytes = file_put_contents(dirname(__FILE__).'/pics/'.$file, file_get_contents($server.'/'.$file[0].$file[1].'/'.$file[2].$file[3].'/'.$file));
        echo ($bytes > 0) ? ' '.$bytes." bytes\n" : " failed\n";
      }
      $dom->free();
      touch($xmlfile.'.done');
    }
  }
}

?>

When I run this script however I get the following:
Code:
PHP Warning: Unknown(): Unable to load dynamic library './php_domxml.dll' - ./php_domxml.dll: cannot open shared object file: No such file or directory in Unknown on line 21


(That's just an example; it generates an error just like that one, except the script name is different. The line number, however, is correct.)

Is there any way I can get this script to run in a Linux environment? It used to be run on Bluehost...and they only do Linux I believe, so surely there is a way? :confused4
 
It'll be php_domxml.so on Linux.
Yeah, I got that much figured out. But when I rewrote the dl line with php_domxml.so, it gave me the exact same error, just with .so instead of .dll.

EDIT:

I took a picture of the error. The command and the resulting error are highlighted in red.

Error
 
You need to go to /usr/lib/php4 and do some ls'ing in the folders thereabouts to see if it actually exists. If it doesn't, you're probably going to have to build it from source.

http://pecl.php.net/package/domxml
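
If you'd rather check from PHP than from the shell, here is a rough sketch that prints the configured extension_dir and lists the shared objects in it. The helper name list_extension_files is made up for illustration, and the directory on your box may well differ from /usr/lib/php4.

PHP:
```php
<?php
// Hypothetical helper (not part of rip.php): list the shared-object
// files found in a directory, sorted by name.
function list_extension_files($dir, $suffix = PHP_SHLIB_SUFFIX)
{
    $files = glob(rtrim($dir, '/') . '/*.' . $suffix);
    if ($files === false) {
        return array(); // glob() failed; treat it as "nothing found"
    }
    $names = array_map('basename', $files);
    sort($names);
    return $names;
}

// extension_dir is where dl() looks for the file name it is given.
$dir = ini_get('extension_dir');
echo "extension_dir = $dir\n";
foreach (list_extension_files($dir) as $name) {
    echo "  $name\n";
}
```

Whatever name shows up in that listing is the name to pass to dl().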


It shows it's there: http://img503.imageshack.us/img503/9799/error2hx0.png


Here is my current rip.php.
PHP:
<?php

dl('php_domxml.so');

$server = "http://danbooru.lolitron.org/data";

$start = 0;
$end = 1041;

if ($argc == 3) { $start = $argv[1]; $end = $argv[2]; }
echo "running from $start to $end\n\n\n";

for($i = $start; $i <= $end; $i++) {
  $xmlfile = sprintf("xml/%06d00.xml", $i);
  if(!file_exists(dirname(__FILE__).'/'.$xmlfile.'.done')) {
    echo "$xmlfile:\n--------\n\n";
    $dom = domxml_open_file(dirname(__FILE__).'/'.$xmlfile);
    if($dom) {
      foreach($dom->get_elements_by_tagname('post') as $post) {
        echo $file = $post->get_attribute('file_name');
        $bytes = file_put_contents(dirname(__FILE__).'/pics/'.$file, file_get_contents($server.'/'.$file[0].$file[1].'/'.$file[2].$file[3].'/'.$file));
        echo ($bytes > 0) ? ' '.$bytes." bytes\n" : " failed\n";
      }
      $dom->free();
      touch($xmlfile.'.done');
    }
  }
}

?>
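
For anyone reading along, the download URL in that script is built from the first four characters of the file name, two directory levels deep. A minimal sketch of the same layout as a standalone function follows. The name mirror_url and the example host are assumptions for illustration only, not part of the original script.

PHP:
```php
<?php
// Hypothetical helper mirroring the URL layout rip.php assumes:
//   <server>/<chars 0-1>/<chars 2-3>/<file name>
function mirror_url($server, $file)
{
    if (strlen($file) < 4) {
        return false; // too short to derive both directory levels
    }
    return $server . '/' . substr($file, 0, 2) . '/' . substr($file, 2, 2) . '/' . $file;
}

// Example host and file name are placeholders.
echo mirror_url('http://mirror.example/data', '0123456789abcdef.jpg'), "\n";
// prints http://mirror.example/data/01/23/0123456789abcdef.jpg
```

Checking the length first avoids the notices that indexing $file[2] or $file[3] would raise on a short or empty file_name attribute.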

I am currently working on installing Windows 2000 with Windows Apache and Windows PHP to test the Windows theory. But I would still really prefer a Linux operation, as it would be MUCH cheaper to host on several load-balanced dedis.
 
No expert or anything but I think you might have mis-named the file. You're trying to use "php_domxml.so" instead of just "domxml.so" cuz that's what I see in your screen shot of the file. Notice no "php_" in the actual fileName. Hope that's the right solution. Peaces I got class!
 
Umm...I don't think it worked. It gave me tons of "WARNING: Function Registration failed - duplicate name" errors, one for each group, so about 30 total. All of them said line 3 of rip.php, I think. It did, however, allow it to find the .so. After all those Function Registration failed errors it says "WARNING: domxml : Unable to register functions, unable to load in Unknown on line 0", followed by the script's own "running from 0 to 1041".

This is so confusing :S.
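
One possible reading of the "duplicate name" warnings, offered as an assumption rather than a diagnosis: domxml may already be loaded (compiled in or enabled through php.ini), so the explicit dl() call tries to register the same functions a second time. A minimal sketch of a guarded load follows. The helper name ensure_extension is made up, and the "php_" prefix rule is the usual Windows naming convention, not something confirmed for this server.

PHP:
```php
<?php
// Hypothetical helper: only call dl() when the extension is not
// already active, and pick the file name for the current platform.
function ensure_extension($name)
{
    if (extension_loaded($name)) {
        return true; // already loaded; calling dl() again would clash
    }
    if (!function_exists('dl')) {
        return false; // dl() is not available in this SAPI
    }
    // PHP_SHLIB_SUFFIX is "so" on Linux and "dll" on Windows.
    $prefix = (PHP_SHLIB_SUFFIX === 'dll') ? 'php_' : '';
    return (bool) @dl($prefix . $name . '.' . PHP_SHLIB_SUFFIX);
}

if (!ensure_extension('domxml')) {
    echo "domxml is not available\n";
}
```

With a guard like this in place of the bare dl() line, the same script could run unchanged on both the Windows box and the Debian server.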
 
If it helps, this is what I'm trying to do:

Code:
danbooru rip
------------

- based on the original xml-dump
- used mirror: lolitron
- some files missing due to being not mirrored or having wrong names
    (check 'val.log' to see them)


Files:
  {x}.zip        contains all the files; split by first char in filename
  xml.zip        original xml-dump
  archive.bat    script for creating the archives (uses posix zip)
  rip.php        script used for mirroring
  validate.php    script used to validate mirrored data
  val.log        stdout of validate.php
  export.php    script for exporting images from the zips based on a tag
  readme.txt    that one should be pretty obvious :p



[rip]
    php -f rip.php [start end]
    
    Default for start and end are 0 and 1041.
    Needs xml.zip extracted (./xml/*.xml) and a ./pics folder.
    

[validate]
    php -f validate.php [start end]
    
    Default for start and end are 0 and 1041.
    Needs xml.zip extracted (./xml/*.xml) and a ./pics folder.


[export]
    php -f export.php tag [tag ...]
    
    Will create a folder named after the tags and put the images inside.
    May take some time due to the large number of files inside the archives.
    The files get named after a combination of their tags and the hash.
      If a file has too many tags, a part of them will be cut.
      (Filesystems have a limit how long the path can be.)



------
Any code is under a "do whatever you want with it, as long as you don't claim
 you wrote it yourself"-license.
 
Have fun ^^
 ~ Ata

I have the pics folder, and I have the XMLs extracted. I just don't know what is going wrong. The command is attempting to run on the XML files. It starts with the first of 1,042 of them (yes, there are 1,042 of the 31KB XML files :knockedou). You can see it running on XML file 00000000.xml, which is the first. Here's an excerpt. WARNING: MAY CONTAIN SOME EXPLICIT TEXT, AS THEY ARE TAGGED BY CATEGORY:

Code:
 <?xml version="1.0" encoding="UTF-8"  ?> 
 <posts>
     <post author="[B]albert[/B]"  score="[B]19[/B]"  date="[B]Wed Feb 28 04:37:50 UTC 2007[/B]" is_warehoused="[B]true[/B]" md5="[B]6b919452c7d722f9d8a82de6ac12dbe0[/B]" id="[B]113970[/B]"  source="[B][/B]" file_name="[B]6b919452c7d722f9d8a82de6ac12dbe0.jpg[/B]" tags="[B]animal_ears  breasts cat_ears final_fantasy final_fantasy_xi mithra naughty_face open_shirt  redhead short_hair sonobe_kazuaki wardrobe_malfunction[/B]" rating="[B]Questionable[/B]"  /> 

    <post author="[B]Ryw[/B]"  score="[B]3[/B]"  date="[B]Wed Feb 28 04:37:42 UTC 2007[/B]" is_warehoused="[B]true[/B]" md5="[B]a4ad16993d0183a77b632b391903f42c[/B]" id="[B]113969[/B]"  source=""  file_name="[B]a4ad16993d0183a77b632b391903f42c.gif[/B]" tags="[B]breasts  monochrome nude robin_sena sketch witch_hunter_robin[/B]" rating="[B]Questionable[/B]"  /> 

    <post author="[B]albert[/B]"  score="[B]2[/B]"  date="[B]Wed Feb 28 04:37:24 UTC 2007[/B]" is_warehoused="[B]true[/B]" md5="[B]f98caaa57942a885abe26e1f57ae95d8[/B]" id="[B]113968[/B]"  source="[B][/B]" file_name="[B]f98caaa57942a885abe26e1f57ae95d8.png[/B]" tags="[B]genom oekaki  seifuku sketch[/B]" rating="[B]Safe[/B]" />  

    <post author="[B]albert[/B]"  score="[B]15[/B]"  date="[B]Wed Feb 28 04:36:28 UTC 2007[/B]" is_warehoused="[B]true[/B]" md5="[B]38c5b9dbaa3ea687ce2662fd28382b1f[/B]" id="[B]113967[/B]"  source="[B][/B]" file_name="[B]38c5b9dbaa3ea687ce2662fd28382b1f.jpg[/B]" tags="[B]cc  code_geass topless undressing[/B]"  rating="[B]Questionable[/B]" /> 

    <post author="[B]albert[/B]"  score="[B]15[/B]"  date="[B]Wed Feb 28 04:36:11 UTC 2007[/B]" is_warehoused="[B]true[/B]" md5="[B]1c826ec4e084e8344a84b5d4267b285b[/B]" id="[B]113966[/B]"  source="[B][/B]" file_name="[B]1c826ec4e084e8344a84b5d4267b285b.jpg[/B]" tags="[B]animal_ears  bat_wings cat_ears earmuffs miniskirt panties pantyshot skirt tail thighhighs  zettai_ryouiki[/B]" rating="[B]Safe[/B]" />  

    <post author="[B]Ryw[/B]"  score="[B]3[/B]"  date="[B]Wed Feb 28 04:35:33 UTC 2007[/B]" is_warehoused="[B]true[/B]" md5="[B]8e682edb03b03cdb9c9db54307843c4e[/B]" id="[B]113965[/B]"  source=""  file_name="[B]8e682edb03b03cdb9c9db54307843c4e.jpg[/B]" tags="[B]gothic  ice_cream robin_sena witch_hunter_robin[/B]"  rating="[B]Safe[/B]" /> 

    <post author="[B]albert[/B]"  score="[B]5[/B]"  date="[B]Wed Feb 28 04:28:26 UTC 2007[/B]" is_warehoused="[B]true[/B]" md5="[B]121deba387f4ed4ce46759be11d26252[/B]" id="[B]113964[/B]"  source="[B][/B]" file_name="[B]121deba387f4ed4ce46759be11d26252.jpg[/B]" tags="[B]animal_ears  fox_ears seifuku tail[/B]" rating="[B]Safe[/B]" />  

    <post author="[B]albert[/B]"  score="[B]4[/B]"  date="[B]Wed Feb 28 04:25:00 UTC 2007[/B]" is_warehoused="[B]true[/B]" md5="[B]42cc9f39321847db4d911ca5cf7a054b[/B]" id="[B]113963[/B]"  source="[B][/B]" file_name="[B]42cc9f39321847db4d911ca5cf7a054b.jpg[/B]" tags="[B]cameltoe  school_swimsuit swimsuit torikoro[/B]"  rating="[B]Questionable[/B]" />

You'll see the first code runs on image 6b919452c7d722f9d8a82de6ac12dbe0. So apparently it is getting hold of the xml as it is supposed to...but it's not parsing right or something. :confused4

EDIT: Removed source links from the XML file so no one comes at me for posting panty shot anime characters :p.
 
danbooru~

Try putting the .so in the same directory as the php file, and maybe change the line to dl('./php_domxml.so');
 
OK, it won't be php_domxml.so, as that's not the filename, and you call it by filename. The only thing that is detected automatically is the location of the extension.

I would also say it might be worth rebuilding the library and posting the build output here so we can see it. There's a high chance something is wrong with the library itself: in post 7 you mentioned function registration errors, which means it's trying to load a broken library. But at the very least it's trying to load it.
 
Alright, I'll give it a shot. I got in contact with someone who has used it before, and I found out I have to have a vanilla install first. That requires PostgreSQL 8.2 or later, Ruby on Rails 1.8.2 or later, and Mongrel proxied in Apache. So first I have to redo my Ubuntu installation...yadayadayada. I'll let you know when I get there. :lol:

I did however find out that the three scripts are indeed built for a Windows server, but the owner left a note saying that porting them to Linux and Mac was extremely easy. Why he never did the coding himself and released the scripts I have no idea :tired2:. But if that gives you some viewpoint as to what you're looking at and what needs done, then you at least have that info now.
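
On the porting point: domxml is a PHP 4 extension, and PHP 5 and later bundle the DOM extension with a DOMDocument class instead. As a rough sketch of what the parsing step might look like there (the function name post_file_names and the sample XML are invented for illustration; this is not the author's port):

PHP:
```php
<?php
// Sketch: collect the file_name attribute of every <post> element
// using the DOM extension (PHP 5+) rather than PHP 4's domxml calls.
function post_file_names($xml)
{
    $dom = new DOMDocument();
    if (!@$dom->loadXML($xml)) {
        return array(); // malformed XML: nothing to report
    }
    $names = array();
    foreach ($dom->getElementsByTagName('post') as $post) {
        $names[] = $post->getAttribute('file_name');
    }
    return $names;
}

// Made-up sample input in the same shape as the dump.
$sample = '<?xml version="1.0" encoding="UTF-8"?>'
        . '<posts><post id="1" file_name="aaaa1111.jpg"/>'
        . '<post id="2" file_name="bbbb2222.png"/></posts>';
print_r(post_file_names($sample));
```

To read the dump files directly, DOMDocument::load() takes a file path in place of loadXML()'s string.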
 
Anime kiddie porn. Way to go man :rolleyes:

Lol, the founder was an anime freak, so Lolitron agreed to hold the massive 35GB mirror archive when he had to leave :p.

Danbooru was a massive gallery, you can tell that because these 1,042XML files contain well over 120,000 images. It was honey bees, to naked girls, to hentai, to cars, to lolita. But yeah....most of it was getting to be pr0n :p.
 
That's just wrong man. "Little anime girls"? Why not just regular girls? Next thing you know, he'll be driving 8 hours to a some house to meet a 12 year old boy just to find Chris Hansen and a bunch of camera/lights. ----ing sicko!
 