
Semalt: 3 Steps to PHP Web Page Scraping


Web scraping, also called web data extraction or web harvesting, is the process of extracting data from a website or blog. This information is then used to set meta tags, meta descriptions, keywords and links for a site, improving its performance in search engine results.

Two main techniques are used to scrape data:

  • Document parsing - An XML or HTML document is converted into a DOM (Document Object Model) tree that can then be queried. PHP ships with a capable DOM extension for this.
  • Regular expressions - Data is pulled out of web documents by matching the text against regular expression patterns.
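
The two techniques above can be sketched side by side. This is only an illustration under assumptions: the sample markup and the function names extractLinksWithDom and extractTitleWithRegex are hypothetical and not part of the original article. The DOM version uses PHP's bundled DOMDocument class, the regex version uses preg_match.

```php
<?php
// Hypothetical sample markup, used only for illustration.
$html = '<html><head><title>Sample Page</title></head>'
      . '<body><a href="/first">First</a><a href="/second">Second</a></body></html>';

// Technique 1: document parsing with the DOM extension.
function extractLinksWithDom(string $html): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally; real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}

// Technique 2: a regular expression that captures the text of the <title> element.
function extractTitleWithRegex(string $html): ?string
{
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

print_r(extractLinksWithDom($html));
var_dump(extractTitleWithRegex($html));
```

As a rule of thumb, DOM parsing tends to cope better with nested or irregular markup, while a regular expression is usually enough for one small, predictable fragment.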

The main issue with scraping data from a third-party website is copyright: you may not have permission to use that data. The technical side, however, is straightforward with PHP. As a PHP programmer, you may need data from different websites for coding purposes. Here we explain how to get data from other sites efficiently, but before that, keep in mind that you will end up with two files, index.php and scrape.php.

Step 1: Create a Form to Enter the Website URL:

First of all, create a form in index.php with a text box for the URL of the website you want to scrape and a Submit button.
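
A minimal sketch of such a form follows. The field names website_url and submit are the ones read from $_POST in step 3; the helper name renderScrapeForm, the label text and the default action scrape.php are assumptions made for illustration.

```php
<?php
// Hypothetical helper that builds the form markup for index.php.
// Only the field names (website_url, submit) come from the article's step 3.
function renderScrapeForm(string $action = 'scrape.php'): string
{
    // Escape the action so an unexpected value cannot break out of the attribute.
    $safeAction = htmlspecialchars($action, ENT_QUOTES, 'UTF-8');

    return '<form method="post" action="' . $safeAction . '">' . "\n"
         . '  <label for="website_url">Website URL to scrape</label>' . "\n"
         . '  <input type="text" name="website_url" id="website_url">' . "\n"
         . '  <input type="submit" name="submit" value="Submit">' . "\n"
         . '</form>' . "\n";
}

echo renderScrapeForm();
```

Posting the form sends both fields to the target script, which is what the isset($_POST['submit']) check in step 3 relies on.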






Step 2: Use a PHP Function to Get the Website Data:

The second step is to create a PHP function, scrapeWebsiteData, in the scrape.php file. It fetches the data using cURL, the client URL library, which lets you connect to and communicate with many kinds of servers over different protocols.

function scrapeWebsiteData($website_url) {
    // Stop early if the cURL extension is not available.
    if (!function_exists('curl_init')) {
        die('cURL is not installed.');
    }

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $website_url);
    // Return the response as a string instead of printing it.
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

    $output = curl_exec($curl);
    curl_close($curl);

    return $output;
}

Here, the function first checks whether PHP cURL is installed. Three cURL functions do the main work: curl_init() initializes the session, curl_exec() executes the request and curl_close() closes the connection. Options are set with curl_setopt(): CURLOPT_URL sets the URL of the website we want to scrape, and CURLOPT_RETURNTRANSFER makes curl_exec() return the fetched page as a string instead of printing it directly, which would otherwise output the whole web page.
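
Because curl_exec() returns false when a request fails, a slightly more defensive variant can be useful. The sketch below is an assumption-laden illustration, not the article's function: the name fetchPage, the timeout default and the choice to follow redirects are all hypothetical additions.

```php
<?php
// Hypothetical variant of the article's function that reports failures
// instead of returning false silently. Returns null when nothing was fetched.
function fetchPage(string $url, int $timeoutSeconds = 10): ?string
{
    if (!function_exists('curl_init')) {
        return null; // cURL extension is not available
    }

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);     // return the body as a string
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);     // follow HTTP redirects (assumed choice)
    curl_setopt($curl, CURLOPT_TIMEOUT, $timeoutSeconds); // give up after this many seconds

    $output = curl_exec($curl);
    if ($output === false) {
        // Log the reason so a failed fetch can be diagnosed.
        error_log('cURL error ' . curl_errno($curl) . ': ' . curl_error($curl));
        $output = null;
    }
    curl_close($curl);

    return $output;
}
```

A caller can then test the result with a strict comparison, for example `$html = fetchPage($url); if ($html === null) { /* handle the failure */ }`.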

Step 3: Scrape Specific Data from the Website:

Now it is time to put the function to work in your PHP file and scrape a specific section of the web page. If you do not want all the data from a given URL, work on the string that CURLOPT_RETURNTRANSFER makes curl_exec() return and mark the start and end of the section you want to scrape.

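if (isset($_POST['submit'])) {
    $html = scrapeWebsiteData($_POST['website_url']);

    // Position where the wanted section begins.
    $start_point = strpos($html, 'Latest Posts');
    // Position of the end marker, searched from the start point
    // (the marker string that closes the section is empty here).
    $end_point = strpos($html, '', $start_point);
    $length = $end_point - $start_point;

    // Keep only the part between the two positions.
    $html = substr($html, $start_point, $length);
    echo $html;
}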
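
The same start-and-end-marker idea can be wrapped in a small helper that also handles a missing marker, since strpos() returns false when nothing is found. This is a sketch under assumptions: the name extractBetween, the sample markup and the closing marker '</section>' are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical helper: returns the text from the start marker up to
// (but not including) the end marker, or null if either marker is absent.
function extractBetween(string $html, string $startMarker, string $endMarker): ?string
{
    if ($startMarker === '' || $endMarker === '') {
        return null; // an empty marker cannot delimit a section
    }

    $start = strpos($html, $startMarker);
    if ($start === false) {
        return null;
    }

    // Look for the end marker only after the start marker.
    $end = strpos($html, $endMarker, $start + strlen($startMarker));
    if ($end === false) {
        return null;
    }

    return substr($html, $start, $end - $start);
}

// Hypothetical sample page fragment; '</section>' is an assumed end marker.
$sample = '<h2>Latest Posts</h2><ul><li>One</li></ul></section><footer>x</footer>';
var_dump(extractBetween($sample, 'Latest Posts', '</section>'));
```

Checking for false with a strict comparison matters here, because a marker found at position 0 would otherwise be mistaken for "not found".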

We recommend that you build a basic knowledge of PHP and regular expressions before using any of this code or scraping a particular blog or website for your own purposes.

December 8, 2017