Professional ASP.NET 1.1 [Electronic resources] (text version)

Alex Homer et al.

From Web Site to Web Service

HTML screen-scraping is the practice of connecting to a web site, retrieving a result (usually HTML), and then sifting through the unstructured data to extract a useful value. For example, to find the current sales rank for this book on Barnes and Noble, you could look it up by ISBN (0-7645-5890-0) using http://shop.barnesandnoble.com/booksearch/isbnInquiry.asp?isbn=0764558900.

You could then view the HTML source and search for 'sales rank'. You would find something like:

<font size="-1">sales rank: 1,823</font>

In the past, to automate this through code, we would write a small application that connected to BarnesAndNoble.com through WinInet or the XMLHttp component and requested the document. Then we'd write some code to search the HTML for the string <font size="-1">sales rank: and for the first </font> that follows it. Anything found between these two strings would be the result.

This isn't efficient, and the code to perform this match would be very fragile. If the HTML changed, your code would no longer function correctly.
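
To make the fragility concrete, here is a minimal sketch of the marker-based search described above. It is written in Python purely for illustration rather than against WinInet or XMLHttp; the helper name extract_between and the "redesigned" markup are hypothetical.

```python
from typing import Optional


def extract_between(html: str, start_marker: str, end_marker: str) -> Optional[str]:
    """Return the text between start_marker and the next end_marker, or None."""
    start = html.find(start_marker)
    if start == -1:
        return None  # the opening marker is not in the page at all
    start += len(start_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return None  # no closing marker after the opening one
    return html[start:end].strip()


START = '<font size="-1">sales rank:'
END = '</font>'

# Markup in the shape quoted earlier in this section.
page = '<td><font size="-1">sales rank: 1,823</font></td>'
rank = extract_between(page, START, END)

# Hypothetical redesign: same information, different tags. The hard-coded
# markers no longer match, so the scraper silently finds nothing.
redesigned = '<td><span class="rank">Sales rank: 1,823</span></td>'
rank_after_redesign = extract_between(redesigned, START, END)
```

Here rank holds the text between the two markers, while rank_after_redesign is None even though the page still shows the same number.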

.NET makes a lot of this much easier. For example, there is support for regular expression pattern-matching. Finding the sales rank value in the preceding string is now simply a matter of creating the appropriate regular expression syntax, and then searching the document:


"size=.-1.>sales rank:.(.*?)</"


The parenthesized group in this regular expression captures the sales rank: the text between 'sales rank:' and the next closing tag.
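
As a rough check of what the pattern captures, the following sketch applies it to the sample markup shown earlier. It uses Python's re module instead of the .NET regular expression classes, on the assumption that the two engines treat this simple pattern the same way. The function name find_sales_rank is hypothetical, and the case-insensitive flag mirrors the ignoreCase setting used in the WSDL later in this section.

```python
import re
from typing import Optional

# Same pattern as in the text; each "." stands in for a quote or a space.
PATTERN = re.compile(r"size=.-1.>sales rank:.(.*?)</", re.IGNORECASE)


def find_sales_rank(html: str) -> Optional[str]:
    """Return the first captured group, or None when the pattern is absent."""
    found = PATTERN.search(html)
    return found.group(1) if found else None


sample = '<font size="-1">sales rank: 1,823</font>'
sales_rank = find_sales_rank(sample)
print(sales_rank)
```

The lazy quantifier (.*?) stops at the first "</" it meets, which is what keeps the capture from running on to a later closing tag in the page.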





Note

You can find more information on regular expressions and pattern-matching in Chapter 16.


While regular expressions simplify searching for strings, you still need to write all the other code to access the site and retrieve the HTML, and then wrap all of this in a friendly API.

ASP.NET Web services automate much of this by allowing you to build custom WSDL documents that specify the location, parameters, regular expression to match, and return types. You can then use a proxy generation tool, such as Visual Studio .NET or wsdl.exe, to generate a proxy object that encapsulates all of this.


Authoring the WSDL


Let's write an example that returns the sales rank for any book, for a provided ISBN. First, the WSDL:


<?xml version="1.0"?>
<definitions xmlns:s="http://www.w3.org/2000/10/XMLSchema"
             xmlns:http="http://schemas.xmlsoap.org/wsdl/http/"
             xmlns:mime="http://schemas.xmlsoap.org/wsdl/mime/"
             xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns:s0="http://tempuri.org/"
             targetNamespace="http://tempuri.org/"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:msType="http://microsoft.com/wsdl/mime/textMatching/">
  <types/>
  <message name="GetBookDetailsHttpGetIn">
    <part name="isbn" type="s:string"/>
  </message>
  <message name="GetBookDetailsHttpGetOut"/>
  <portType name="BarnesAndNobleHttpGet">
    <operation name="GetBookDetails">
      <input message="s0:GetBookDetailsHttpGetIn"/>
      <output message="s0:GetBookDetailsHttpGetOut"/>
    </operation>
  </portType>
  <binding name="BarnesAndNobleHttpGet" type="s0:BarnesAndNobleHttpGet">
    <http:binding verb="GET"/>
    <operation name="GetBookDetails">
      <http:operation location="/booksearch/isbnInquiry.asp"/>
      <input>
        <http:urlEncoded/>
      </input>
      <output>
        <msType:text>
          <msType:match name="Rank"
                        pattern="size=.-1.&gt;sales rank:.(.*?)&lt;/"
                        ignoreCase="true"/>
        </msType:text>
      </output>
    </operation>
  </binding>
  <service name="BarnesAndNoble">
    <port name="BarnesAndNobleHttpGet" binding="s0:BarnesAndNobleHttpGet">
      <http:address location="http://shop.barnesandnoble.com"/>
    </port>
  </service>
</definitions>


This WSDL defines a <service> named BarnesAndNoble, which also names the end-point, http://shop.barnesandnoble.com, that we'll make queries against. This service does not use SOAP, but instead uses HTTP GET to make requests. Two other parts of the WSDL you just saw are of interest: <binding> and <message name="GetBookDetailsHttpGetIn">.

The <binding> element defines an operation, GetBookDetails, that further qualifies the end-point to which the HTTP GET request is sent (/booksearch/isbnInquiry.asp). It also defines, in the <output> section, a <msType:match ...> element. Within a <match> element, we declare the regular expression used for our string match. The value of the name attribute controls the name of the property the proxy will create, which is used to access the 'sales rank' of a given ISBN. The pattern value is the regular expression used to search the document and return results.

The other section of interest is the <message name="GetBookDetailsHttpGetIn"> element, which defines the parameters that we want to send as part of the HTTP GET request. The defined value, isbn (which is of type string), will be used to formulate a request such as http://shop.barnesandnoble.com/booksearch/isbnInquiry.asp?isbn=1861004753 or http://shop.barnesandnoble.com/booksearch/isbnInquiry.asp?isbn=1861004885.
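
The way these pieces combine into a request can be sketched as follows. This is a Python illustration, not the code the .NET proxy runs: it assumes the request URL is simply the service address, the operation location, and the URL-encoded message parts joined together, which is consistent with the example URLs above. The constant and function names are hypothetical.

```python
from urllib.parse import urlencode

# Values taken from the WSDL above.
SERVICE_ADDRESS = "http://shop.barnesandnoble.com"   # <http:address location=...>
OPERATION_LOCATION = "/booksearch/isbnInquiry.asp"   # <http:operation location=...>


def build_request_url(**parts: str) -> str:
    """Join address, operation location, and URL-encoded message parts."""
    return SERVICE_ADDRESS + OPERATION_LOCATION + "?" + urlencode(parts)


url = build_request_url(isbn="1861004753")
print(url)
```

Each <part> in the input message becomes one name=value pair in the query string, so adding a part to the message would add a parameter to the request.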

We're now ready to build a proxy for this WSDL.


Building the Proxy


Open a command prompt and move to the directory where the WSDL was created (the command below assumes the file was saved as bn.wsdl). Then issue the following command:


wsdl.exe /language:VB bn.wsdl

If there were no errors in the WSDL, you should have a VB.NET source file named BarnesAndNoble.vb. Let's look at the source of this file (comments have been removed; the asynchronous Begin/End methods appear in the listing but are not relevant to this discussion):


Option Strict Off
Option Explicit On

Imports System
Imports System.ComponentModel
Imports System.Diagnostics
Imports System.Web.Services
Imports System.Web.Services.Protocols
Imports System.Xml.Serialization

<System.Diagnostics.DebuggerStepThroughAttribute(), _
 System.ComponentModel.DesignerCategoryAttribute("code")> _
Public Class BarnesAndNoble
    Inherits System.Web.Services.Protocols.HttpGetClientProtocol

    ' The base address comes from <http:address> in the WSDL.
    Public Sub New()
        MyBase.New
        Me.Url = "http://shop.barnesandnoble.com"
    End Sub

    ' Synchronous call: sends the HTTP GET and applies the text matches.
    <System.Web.Services.Protocols.HttpMethodAttribute( _
        GetType(System.Web.Services.Protocols.TextReturnReader), _
        GetType(System.Web.Services.Protocols.UrlParameterWriter))> _
    Public Function GetBookDetails(ByVal isbn As String) As GetBookDetailsMatches
        Return CType(Me.Invoke("GetBookDetails", _
            (Me.Url + "/booksearch/isbnInquiry.asp"), _
            New Object() {isbn}), GetBookDetailsMatches)
    End Function

    ' Asynchronous Begin/End pair for the same operation.
    Public Function BeginGetBookDetails(ByVal isbn As String, _
            ByVal callback As System.AsyncCallback, _
            ByVal asyncState As Object) As System.IAsyncResult
        Return Me.BeginInvoke("GetBookDetails", _
            (Me.Url + "/booksearch/isbnInquiry.asp"), _
            New Object() {isbn}, callback, asyncState)
    End Function

    Public Function EndGetBookDetails(ByVal asyncResult As System.IAsyncResult) _
            As GetBookDetailsMatches
        Return CType(Me.EndInvoke(asyncResult), GetBookDetailsMatches)
    End Function
End Class

Public Class GetBookDetailsMatches
    ' The pattern and IgnoreCase value come from <msType:match> in the WSDL.
    <System.Web.Services.Protocols.MatchAttribute( _
        "size=.-1.>sales rank:.(.*?)</", IgnoreCase:=True)> _
    Public Rank As String
End Class


This auto-generated source file contains two classes. The first class, BarnesAndNoble, has a synchronous function called GetBookDetails that accepts an ISBN as a parameter and returns an instance of GetBookDetailsMatches. The returned type, GetBookDetailsMatches, is the second class defined in the source file. It contains a single member variable, Rank. The Rank member variable has an attribute applied to it that carries the regular expression declared in the WSDL.
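
The idea of attaching a pattern to a field declaratively, and letting generic code fill the field in, can be sketched outside .NET as follows. This Python analogue is only an illustration of the design and makes no claim about how TextReturnReader is implemented; the helpers match and read_matches are hypothetical, and leaving a field as None when nothing matches is an assumption of this sketch.

```python
import re
from dataclasses import dataclass, field, fields
from typing import Optional


def match(pattern: str, ignore_case: bool = False):
    """Declare a field whose value is captured by the given pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    return field(default=None, metadata={"regex": re.compile(pattern, flags)})


@dataclass
class GetBookDetailsMatches:
    # Plays the role of the MatchAttribute on the Rank member.
    rank: Optional[str] = match(r"size=.-1.>sales rank:.(.*?)</", ignore_case=True)


def read_matches(result_type, text: str):
    """Run every declared pattern against text and build the result object."""
    values = {}
    for f in fields(result_type):
        found = f.metadata["regex"].search(text)
        values[f.name] = found.group(1) if found else None
    return result_type(**values)


result = read_matches(GetBookDetailsMatches, '<font size="-1">sales rank: 1,823</font>')
print(result.rank)
```

Because the pattern lives next to the field, extracting a second value from the same page would only require declaring another field; the reading code stays unchanged.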

Compile this source file by executing the following command (note that this command should be typed all on one line):


vbc.exe /t:library /r:System.dll /r:System.Web.dll /r:System.Web.Services.dll
/r:System.Xml.dll BarnesAndNoble.vb

This will generate an assembly named BarnesAndNoble.dll.


Using the Screen Scrape Proxy


Now that an assembly is built, you can deploy it to the \bin directory of a web application. You can then write the following ASP.NET page (using VB.NET here) that loads and uses the BarnesAndNoble proxy:


<%@ Import Namespace="System.Net" %>

<Script runat=server>
  Public Sub GetSalesRank(sender As Object, e As EventArgs)
    Dim bn As New BarnesAndNoble()
    Dim match As GetBookDetailsMatches
    match = bn.GetBookDetails(isbn.Value)

    rank.Text = match.Rank
  End Sub
</Script>

<font face=arial>
  <form runat=server>
    ISBN number: <input type=text id="isbn" runat=server/>
    <input type=submit id=submit onserverclick="GetSalesRank" runat=server/>
  </form>
  Sales Rank: <font color=red><b><asp:label id=rank runat=server/></b></font>
</font>


Using C#, you would write:

<%@ Page Language="C#" %>
<%@ Import Namespace="System.Net" %>

<Script runat="server">
  public void GetSalesRank(Object sender, EventArgs e) {
    BarnesAndNoble bn = new BarnesAndNoble();
    GetBookDetailsMatches match;
    match = bn.GetBookDetails(isbn.Value);
    rank.Text = match.Rank;
  }
</Script>

<font face=arial>
  <form runat=server ID="Form1">
    ISBN number: <input type=text id="isbn" runat=server NAME="isbn"/>
    <input type=submit id=submit onserverclick="GetSalesRank" runat=server NAME="submit"/>
  </form>
  Sales Rank: <font color=red><b><asp:label id=rank runat=server/></b></font>
</font>

This ASP.NET page creates a new instance of the BarnesAndNoble proxy object in the GetSalesRank event handler, which runs when the input button is clicked. The BarnesAndNoble instance, bn, is used to call the GetBookDetails method (passing in the ISBN number), and the Rank value of the returned object is displayed in the rank label:


Figure 20-7:

In the screenshot shown in Figure 20-7 you can see that the ASP.NET page has successfully queried BarnesAndNoble.com for the sales rank of this book.





Note

This screen-scraping feature of ASP.NET Web services allows you to turn any web site into a web service. You can simply author the WSDL, and VS.NET or wsdl.exe takes care of the rest.

