
Word frequencies in texts

 

p.specht

Linguists and philologists who analyse old texts often compile text statistics, among other things on the frequencies of the words and text elements used. These can then allow conclusions about the author. Crypto-analysts need such statistics too, for example as preparation, in order to detect the underlying language in the first place...

Here, though, only the principle matters, e.g. how to index by descending frequency while keeping ascending order within each frequency group. File operations were deliberately left out, since this is only about the algorithm...
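The target ordering can be sketched in a few lines of Python. This is only an illustration of the principle, not the XProfan listing that follows; the function name `rank_words` and the sample sentence are made up for this sketch.

```python
from collections import Counter

def rank_words(words):
    """Return (word, count) pairs: count descending, word ascending within equal counts."""
    counts = Counter(w.upper() for w in words)
    # Negating the count turns the ascending sort into a descending one for the
    # frequency, while ties are still resolved alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

sample = "the cat and the dog and the bird".split()
for word, count in rank_words(sample):
    print(f"<{count}> {word}")
```

Here THE (3) comes first, then AND (2), and the three words that occur once follow in alphabetical order.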
WindowTitle "TEXT ANALYSIS: determine the frequency of text elements"
'(CL) CopyLeft 2014-10 by P.Specht, Vienna; no warranty whatsoever!
Init:
WindowStyle 24
Declare Text$,t$[],v&[],n&,w$[],h&[],m&,i&,j&,wd$,gef&,maxh&,maxw$
Declare kk$,kl&,key$[]
Beginn:
Font 2:randomize:CLS rnd(8^8)
TEXT_EINLESEN
' TEXT PREPARATION: modifies the text that was read in! Alternative: work on a copy...
Text$=translate$(Text$,"\n"," "):Text$=translate$(Text$,"\t"," ")
Text$=translate$(Text$,"."," . "):Text$=translate$(Text$,","," , ")
Text$=translate$(Text$,"!"," ! "):Text$=translate$(Text$,"?"," ? ")
Text$=translate$(Text$,"-"," - "):Text$=translate$(Text$,"("," ( ")
Text$=translate$(Text$,")"," ) "):Text$=translate$(Text$,"'"," ' ")
Text$=translate$(Text$,"´"," ´ "):Text$=translate$(Text$,"'"," ' ")
Text$=translate$(Text$,":"," : "):Text$=translate$(Text$,"  "," ")
Text$=translate$(Text$,"  "," "):Text$=translate$(Text$,""," \ \ ")
Text$=translate$(Text$,"//"," / / ")
t$[]=explode(Text$," "):t$[]=upper$(t$[&index])
n&=sizeof(t$[]):print " ";n&;" text items prepared."
' UNIQUE LIST: store only the first occurrence of each word and count its frequency:
m&=-1:maxh&=0:whileloop 0,n&-1:i&=&Loop:wd$=t$[i&]
  locate 1,1:print tab(69);mid$("--\\||//",1+(m& mod 8),1);' ACTIVITY MARKER
  if m&=-1:w$[0]=wd$:h&[0]=1:m&=0:maxh&=1:maxw$=wd$
  else :gef&=0:whileloop 0,m&:j&=&Loop:if w$[j&]=wd$:h&[j&]=h&[j&]+1
    ' The following line is only needed if the frequency h&[] has to be 'inverted' in the key array key$[] { h&[]=maxh&-h&[&index] }
    if maxh&<h&[j&]:maxh&=h&[j&]:maxw$=w$[j&]:endif
  gef&=1:break :endif :endwhile :case gef&:continue : inc m&:w$[m&]=wd$:h&[m&]=1:endif :endwhile
print "\n ";int(m&+1);" different words or characters found.\n"
print " Most frequent text element:  ";chr$(34);maxw$;chr$(34),"<";maxh&;">"
' BUILD THE KEY ARRAY
' Goal: frequency h&[] descending, but words w$[] ascending within each frequency group!
' The key width follows the number of digits of maxh&, the largest value that can occur.
setsize key$[],m&+1:kl&=int(lg(maxh&)):kk$=mkstr$("0",kl&):kl&=kl&+1
' TRICK: since only ascending indexing is available, the key array is mirrored at maxh&:
h&[]= maxh&-h&[&index]
key$[] = right$( kk$+str$(h&[&index]),kl&)+w$[&index]
h&[]= maxh&-h&[&index]' convert back to the original; alternatively, a separate array could be set up
' INDEX THE KEY ARRAY: by h&[] descending and w$[] ascending
setsize v&[],m&+1
v&[]=QuickIndex$Up(key$[])
' Clean up
Clear key$[]' no longer needed
'{ OUTPUT
print "\n Text items by frequency: "
print "----------------------------------------------------------"
whileloop 0,m&
  print " ";int(1+&Loop);". ";tab(7);" <";h&[v&[&Loop]];"> ";w$[v&[&Loop]];" "
  if %csrlin>30:waitinput:cls rnd(8^8):endif'<<< depends on monitor resolution
endwhile
print "----------------------------------------------------------"
'}
BEEP:Waitinput
Print " Goodbye!"
waitinput 6000
END

proc QuickIndex$Up :parameters a$[]
  ' Iterative quicksort with median-of-three pivot; returns an ascending index into a$[]
  declare n&,p&,l&,r&,s&,sl&[],sr&[],t&,x$,i&,j&,v&[]
  n&=sizeof(a$[]):setsize v&[],n&:v&[]=&index:s&=1:sl&[1]=0:sr&[1]=n&-1
  while s&>0:l&=sl&[s&]:r&=sr&[s&]:dec s&:while l&<r&:i&=l&:j&=r&:p&=(l&+r&)\2
    if a$[v&[l&]]>a$[v&[p&]]:t&=v&[l&]:v&[l&]=v&[p&]:v&[p&]=t&:endif
    if a$[v&[l&]]>a$[v&[r&]]:t&=v&[l&]:v&[l&]=v&[r&]:v&[r&]=t&:endif
    if a$[v&[p&]]>a$[v&[r&]]:t&=v&[p&]:v&[p&]=v&[r&]:v&[r&]=t&:endif :x$=a$[v&[p&]]
    while i&<=j&:while a$[v&[i&]]<x$:inc i&:endwhile :while x$<a$[v&[j&]]:dec j&:endwhile
      if i&<=j&:t&=v&[i&]:v&[i&]=v&[j&]:v&[j&]=t&:inc i&:dec j&:endif :endwhile
    if (j&-l&)<(r&-i&):if i&<r&:inc s&:sl&[s&]=i&:sr&[s&]=r&:endif :r&=j&:else
    if l&<j&:inc s&:sl&[s&]=l&:sr&[s&]=j&:endif :l&=i&:endif :endwhile :endwhile
  return v&[]
endproc

proc TEXT_EINLESEN' Input part: initialise Text$, e.g. with LOREM IPSUM
  Text$="The quick brown fox jumps over the lazy dog. \n\n" +\
  "Lorem Ipsum is simply dummy text of the printing and typesetting industry. "+\
  "Lorem Ipsum has been the industry's standard dummy text since the 1500s, when "+\
  "an unknown printer took a handful of words and scrambled them "+\
  "to make a type specimen book. It has survived not only five centuries, "+\
  "but also the leap into electronic typesetting (note: "+\
  "almost unchanged). It became popular in the 1960s with the release of 'Letraset' sheets "+\
  "containing passages of Lorem Ipsum, and with desktop software such as 'Aldus PageMaker' "+\
  "- likewise with Lorem Ipsum. " +\
  "It is a long-established fact that a reader is distracted by the text when "+\
  "looking at a layout. The point of using Lorem Ipsum is that it shows a more or less "+\
  "normal distribution of letters and therefore looks like readable language. "+\
  "Many desktop publishers and web editors now use Lorem Ipsum as their "+\
  "default text, and an Internet search for 'lorem ipsum' reveals many websites "+\
  "where it still appears. There are now several versions of Lorem Ipsum, "+\
  "some by accident, others on purpose (influenced by humour and personal taste). \n\n" +\
  \
  "Believe it or not, Lorem Ipsum is not just random text. It has "+\
  "roots in Latin literature from 45 BC, which makes it over 2000 years old. "+\
  "Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up "+\
  "one of the more obscure words, 'consectetur', from a Lorem Ipsum passage and found the "+\
  "undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "+\
  "'de Finibus Bonorum et Malorum' (The Extremes of Good and Evil) by Cicero, written "+\
  "in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. "+\
  "The first line of Lorem Ipsum, ''Lorem ipsum dolor sit amet...'', comes from a line "+\
  "in section 1.10.32. \n"+\
  \
  "The standard chunk of Lorem Ipsum, used since the 1500s, is reproduced for those "+\
  "interested. Sections 1.10.32 and 1.10.33 of ''de Finibus Bonorum et Malorum'' by Cicero "+\
  "are also reproduced in their original form, accompanied by the English version from "+\
  "1914 (H. Rackham).\n\n"+\
  \
  "There are many variations of the Lorem Ipsum passages, but most have suffered alteration "+\
  "in some form, through humour or randomised words that do not look even slightly "+\
  "believable. If you use a passage of Lorem Ipsum, you should make sure that "+\
  "no unwanted words are hidden in the middle of the text. Many generators on the Internet "+\
  "also tend to repeat predefined chunks - which makes it necessary to develop a proper "+\
  "generator. We use a dictionary of over 200 Latin words, "+\
  "combined with a handful of model sentence structures, to make Lorem Ipsum look believable. "+\
  "The generated Lorem Ipsum is therefore free of repetition, humour or inappropriate "+\
  "remarks etc. The text itself can be created at https://de.lipsum.com/feed/html"
Endproc

PROGEND
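The key-mirroring trick from the listing can be replayed in Python as a rough sketch. It assumes, as the listing does, that only an ascending index sort is available. `mirrored_index` is a hypothetical helper and the sample data is invented; Python's built-in `sorted` merely stands in for QuickIndex$Up.

```python
def mirrored_index(words, counts):
    """Return indices ordering by count descending, word ascending, using an ascending sort only."""
    max_count = max(counts)
    width = len(str(max_count))  # enough digits for every mirrored value
    # Mirror each count at max_count and zero-pad it, so that a plain ascending
    # string comparison puts the most frequent words first.
    keys = [str(max_count - c).zfill(width) + w for w, c in zip(words, counts)]
    # Ascending index sort over the keys (stand-in for QuickIndex$Up).
    return sorted(range(len(keys)), key=lambda i: keys[i])

words = ["DOG", "THE", "CAT", "AND"]
counts = [1, 12, 1, 2]
order = mirrored_index(words, counts)
print([(words[i], counts[i]) for i in order])
```

In this sketch the pad width follows the largest count, so mirrored values with different numbers of digits still compare correctly as strings.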
 
XProfan 11
Computer: a device that lets you make 50,000 errors within microseconds, e.g. 'daß' instead of 'das'...
05/15/21  
 


