Finding capital letters in a text string

 

Thomas
Freier
I have a question: how can the following problem be solved better (faster)?
In a text file created from a PDF by Adobe or PDFtoText.exe, words get run together. I would like to separate them again wherever a capital letter occurs inside a word. This is preparation for speech output, and it does not sound good when there is no space before a noun. My solution so far:
' Split run-together words at a capital letter
Dim$ 500
declare this$, insideThis$, Nr_B%, Nr_B1%, Such$ , Such%, Edit$, N_Strg%, Ende%

Proc isOneInside   ' by Michael Wodrich

    ' Syntax: isOneInside( this$, insideThis$ )
    ' Is one of the characters in this$ also contained in insideThis$?
    ' this$ is treated as a list of individual characters.
    Parameters this$, insideThis$
    Declare Ok%

    WhileLoop Len(this$)

        If InStr(Mid$(this$, &loop, 1), insideThis$)

            If &Loop>Nr_B1%

                Ok% = 1
                Nr_B% = &Loop
                BREAK

            EndIf

        EndIf

    EndWhile

    Return Ok%

EndProc

cls
Edit$ = "WastutLübeckaufdemReisemarktHeute ohneTourismusmanager"
Print Edit$
insideThis$ = "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÜÖ"
' .............. split off all words and store them

WhileLoop 50

    List$ &Loop = Substr$(Edit$, &Loop, " ")

    If List$(&Loop) = ""

        N_Strg% = &Loop - 1
        Break

    EndIf

EndWhile

' ............. is there a capital letter inside a word?

WhileLoop n_Strg%

    Such$= List$(&Loop)
    print such$
    this$= right$(Such$,len(Such$)-1)
    Nr_B1% = 0
    ' .......................... search from the start of the word, skipping the 1st letter

    WhileLoop 10

        if isOneInside( this$, insideThis$ )

            Such$ = Ins$(" ", Such$, (Nr_B% + 1))  ' found: insert a space before the capital letter
            Nr_B1% = Nr_B% + 1
            this$= right$(Such$,len(Such$)-1)

        Else

            Break

        endif

    Wend

    print such$
    Print
    List$ &Loop = Such$

EndWhile

Print
' ............. reassemble the string
Edit$ = ""

WhileLoop n_Strg%

    Edit$ = Edit$ + List$(&Loop) + " "

EndWhile

WaitKey
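
For comparison, the same idea (a single pass over the characters, putting a space in front of every capital letter that does not start a word) can be sketched in a few lines. This is a Python illustration rather than XProfan, and the function name split_at_inner_capitals is made up for the example:

```python
def split_at_inner_capitals(text: str) -> str:
    """Insert a space before every capital letter that is not at the start of a word."""
    out = []
    prev = ""
    for ch in text:
        # A capital directly after a non-space character marks a missing word break.
        if ch.isupper() and prev and not prev.isspace():
            out.append(" ")
        out.append(ch)
        prev = ch
    return "".join(out)


if __name__ == "__main__":
    sample = "WastutLübeckaufdemReisemarktHeute ohneTourismusmanager"
    print(split_at_inner_capitals(sample))
```

Like the XProfan version above, this also splits runs of capitals ("PDF" becomes "P D F"), so abbreviations would need separate handling.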
 
Regards, Thomas
Windows XP SP2, XProfan X2
11/21/07  
 




Jörg
Sellmeyer
Like this, for example:
' requires XProfan 10 or later
Declare t$
Var Edit$ = "WastutLübeckinÖstereichaufdemReisemarktHeuteohneTourismusmanager"
Print Edit$

While Match$("([a-z|äöü])([A-Z|ÄÖÜ])",Edit$) > ""

    t$ = $Match
    Edit$ = Translate$(Edit$,t$,Left$(t$,1) + " " + Right$(t$,1))

Wend

Print Edit$
put
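
The loop above can also be written as a single substitution. The following is a Python sketch of that idea, not XProfan; the name split_camel is made up, and adding ß to the lowercase class is my own assumption. In most regex dialects a | inside a character class is a literal character, so the sketch leaves it out:

```python
import re

# A lowercase letter (including German umlauts and ß) directly followed by a capital.
BOUNDARY = re.compile(r"([a-zäöüß])([A-ZÄÖÜ])")


def split_camel(text: str) -> str:
    """Put a space between a lowercase letter and a capital letter that follows it."""
    return BOUNDARY.sub(r"\1 \2", text)


if __name__ == "__main__":
    sample = "WastutLübeckinÖstereichaufdemReisemarktHeuteohneTourismusmanager"
    print(split_camel(sample))
```

Unlike the character-by-character approach, this leaves runs of capitals such as "PDF" alone, because a split only happens after a lowercase letter.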

In the same place you can also set off all the prepositions and conjunctions, and you already have halfway readable text. If you also take care of some forms of "sein" (to be) and "haben" (to have) as well as a few basic verbs (e.g. "machen")...
Oh yes, the articles too! With the right Match$ command, all of this can be done in one loop.
 
Windows XP SP2 XProfan X4
... and now for something completely different from Profan ...
11/21/07  
 




Thomas
Freier
@Jörg, THANK YOU, that looks short and good.

Regards, Thomas
11/21/07  
 




Jörg
Sellmeyer
Thomas Freier wrote:
@Jörg, THANK YOU, that looks short and good.

It is short and good.
11/21/07  
 




Jörg
Sellmeyer
Hello Thomas,
I'll answer here so that others get something out of it too:
Hyphens at the end of a line can easily be filtered out with Translate$(Text$,"-\n","").
You can also break your text into paragraphs right away:
Now you have, in principle, paragraph-formatted running text. You can hand it over to the read-aloud program or, if your wife would like to read it herself, put it into a MultiEdit in any font size. With automatic line wrapping switched on (negative value for the height), it should then be readable without problems. In an RTF control you can even format it as you like.
Regards,
Jörg
11/22/07  
 



Hm, imho something performant would also work without regexp and without loops, using
var in$=Hier_die_Funktion_die_den_Text_aus_PDF_liest()
in$=translate$(in$,"-\n","")
in$=translate$(in$,"|",chr$(1))
in$=translate$(in$,"!\n","!|")
in$=translate$(in$,"?\n","?|")
in$=translate$(in$,".\n",".|")
in$=translate$(in$,"\n","")
in$=translate$(in$,"|","\n") ' for looks I would probably even use a double line break here; not important...
in$=translate$(in$,chr$(1),"|")
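
To make the order of the replacements easier to follow, here is the same chain as a Python sketch (not XProfan). The function name reflow and the joiner parameter are my own additions. The default joiner mirrors the code above and assumes wrapped lines already end in a space; if they do not, passing " " avoids gluing words together:

```python
def reflow(text: str, joiner: str = "") -> str:
    """Join hard-wrapped lines, keeping a break only after '.', '!' or '?'."""
    sentinel = "\x01"                       # temporary stand-in for literal pipes
    text = text.replace("-\n", "")          # undo end-of-line hyphenation
    text = text.replace("|", sentinel)      # protect pipes already in the text
    for mark in ("!", "?", "."):
        # A line break right after sentence-ending punctuation counts as a paragraph end.
        text = text.replace(mark + "\n", mark + "|")
    text = text.replace("\n", joiner)       # remaining line breaks are only soft wraps
    text = text.replace("|", "\n")          # restore the paragraph breaks
    return text.replace(sentinel, "|")      # restore the original pipes


if __name__ == "__main__":
    print(reflow("Ein Bei-\nspiel mit \nUmbruch.\nNeuer | Absatz!\n"))
```

The sentinel step only matters if the input can contain a | of its own; otherwise those pipes would be turned into line breaks by the second-to-last replacement.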
11/23/07  
 




Jörg
Sellmeyer
Whether it is faster would have to be tested. My first loop is run through 3 times at most anyway. If, for example, there is no "?\n" in the text, then even only twice.
The second loop can be dropped anyway. That was an error in reasoning on my part.
Translate$(Text$,"|","\n")

is already enough.
That puts me at five lines and possibly fewer passes.
Besides, I find these RegEx things kind of cool.
11/23/07  
 



OK, let's test! A perfect setup for a test skeleton: [...]  (simply put whatever is to be tested into perfproc)

That would still be really interesting...

But the coolness factor is objectively impossible to overlook.
 
11/23/07  
 




Thomas
Freier
Cool!!! You are faster than sound. A little while ago there was still something else here; I had gone off looking for software. I will try the hints tomorrow.
Jörg, you're right, there are problems that could interest many people.

Regards, Thomas
11/23/07  
 




Thomas
Freier
So, I had to try it after all. Either I am making a mistake, or my solution has the fewest of them.
What triggered this project: the attempt to have a newspaper page, available as a PDF, read aloud. The whole page is no problem; Adobe can do that too. But if I want to read just one article, or have it read aloud once more, that does not work. And who wants to listen to every advertisement!
In the application, the program is to be controlled via the keys only, up to and including scanning, but I am not that far yet.
An old laptop with 800*600 resolution is used for this.
Attached is also a text file generated by Adobe (LN-Adobe.txt), in which Adobe complains about a missing structure (with a newspaper, nobody knows what is meant by that) and then creates the text line by line across several blocks. Grrrrr!

1.345 kB
Uploaded: 11/23/07
Downloads: 114

Regards, Thomas
11/23/07  
 


