Subject: Re: [PD] http, html and textfiles
From: mark edward grimm (mgr...@syr.edu)
Date: May 2, 2008 6:33:55 am
If we WERE going to use just Pd for longer text processing, what optimization methods would be recommended?
Is there a particular font that Pd handles better than others? Would it be wise to strip the text of backslashes, spaces, commas and so on, as I'm assuming from your post? Can Pd grab a random line from a text file so that it doesn't have to load the whole thing?
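To illustrate the "random line" question: outside of Pd, a single random line can be picked without holding the whole file in memory by using reservoir sampling. The following is only a minimal Python sketch under that assumption; the function name `random_line` is hypothetical and nothing here is a Pd feature.

```python
import random


def random_line(path, rng=random):
    """Return one uniformly random line from the text file at `path`.

    The file is read once, line by line, so only the current line and
    the current pick are held in memory. Returns None for an empty file.
    """
    chosen = None
    with open(path, encoding="utf-8", errors="replace") as f:
        for n, line in enumerate(f, start=1):
            # Reservoir sampling: the n-th line replaces the pick
            # with probability 1/n.
            if rng.randrange(n) == 0:
                chosen = line.rstrip("\n")
    return chosen
```

The file still has to be scanned once from start to end; the saving is in memory, not in reading time.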
--- Frank Barknecht <fb...@footils.org> wrote:
Hello, wolfgang schwarzenbrunner wrote:
I am working on a little project in which websites are going to be parsed. I thought this might be a nice thing to do using the regex object from zexy. The only problem I am facing right now is that I have no idea how I could get an HTML file onto my hard disk using Pd (something like an HTTP browsing object)...
Yep: Don't use Pd for text processing.
Pd is good at many things, but it's not good at parsing and modifying larger amounts of text. AFAIK there is still no garbage collection for unused symbols (Pd's "strings"), and it's overcomplicated to deal with certain characters (backslashes, spaces, commas, ...) when they should not be interpreted by Pd.
What I would recommend is to do your text processing in a different language. Many (scripting) languages that are great with text can be used inside of Pd: Lua, Python, Java, Scheme, etc. Most of these also include, or can easily be extended with, nice web browsing tools (CURL, Socket, system("wget") ...). In the end you can do both the browsing and all the processing in one place, and then only need to feed the results over to Pd in a format that Pd can handle more elegantly than it can handle large amounts of text.
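As a rough illustration of that division of labour, here is a minimal Python sketch that fetches a page, reduces it to plain words, and sends them to Pd as a single message. It assumes a running patch with a [netreceive] object listening for UDP; the port number 3000, the selector "words" and all function names are hypothetical choices, not anything prescribed by Pd.

```python
import re
import socket
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def fetch(url):
    """Download a page and return it as text (UTF-8 assumed)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def html_to_words(html):
    """Return the words of an HTML string, without markup or punctuation."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Keeping only word characters drops backslashes, commas,
    # semicolons and dollar signs, which Pd would otherwise interpret.
    return re.findall(r"\w+", " ".join(parser.chunks))


def to_fudi(selector, words):
    """Format a selector and its atoms as one semicolon-terminated message."""
    return (" ".join([selector, *words]) + ";\n").encode("utf-8")


def send_to_pd(payload, host="127.0.0.1", port=3000):
    """Send one datagram to a [netreceive] listening for UDP (assumed setup)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

A call such as `send_to_pd(to_fudi("words", html_to_words(fetch(url))))` would then deliver the cleaned-up words as one list. For long pages it is probably wiser to send the words in smaller batches, since every distinct word still becomes a symbol inside Pd.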
Of course it depends a bit on how complex your project is, so you may get away with pure Pd as well, but IMO it's a better use of Pd to externalize the text processing to a language better suited.
-- Frank Barknecht
____________________ mark edward grimm | m.f.a | ed.m syracuse u. | vpa foundations | timearts adjunct | new media consultant megrimm.net | socialmediagroup.org & .com mgr...@syr.edu | 315.378.2136