  • Using flairNLP
    Programming / Natural Language Processing 2022. 9. 11. 18:35

     

    flairNLP was recommended to me, so I tried installing it.

    Its main strength is NER, but it looks like embeddings are easy to use as well.

     

    First, I installed it with pip install flair.

    $ pip install flair
    Collecting flair
      Using cached flair-0.11.3-py3-none-any.whl (401 kB)
    Requirement already satisfied: scikit-learn>=0.21.3 in ./anaconda3/lib/python3.9/site-packages (from flair) (1.0.2)
    Collecting deprecated>=1.2.4
      Using cached Deprecated-1.2.13-py2.py3-none-any.whl (9.6 kB)
    Collecting langdetect
      Using cached langdetect-1.0.9.tar.gz (981 kB)
    Requirement already satisfied: torch!=1.8,>=1.5.0 in ./anaconda3/lib/python3.9/site-packages (from flair) (1.7.1+cu110)
    Requirement already satisfied: gensim>=3.4.0 in ./anaconda3/lib/python3.9/site-packages (from flair) (4.1.2)
    Collecting gdown==4.4.0
      Using cached gdown-4.4.0.tar.gz (14 kB)
      Installing build dependencies ... done
      Getting requirements to build wheel ... done
        Preparing wheel metadata ... done
    Collecting huggingface-hub
      Using cached huggingface_hub-0.9.1-py3-none-any.whl (120 kB)
    Requirement already satisfied: regex in ./anaconda3/lib/python3.9/site-packages (from flair) (2022.3.15)
    Requirement already satisfied: tabulate in ./anaconda3/lib/python3.9/site-packages (from flair) (0.8.9)
    Collecting sentencepiece==0.1.95
      Downloading sentencepiece-0.1.95-cp39-cp39-manylinux2014_x86_64.whl (1.2 MB)
         |████████████████████████████████| 1.2 MB 31.8 MB/s
    Collecting janome
      Using cached Janome-0.4.2-py2.py3-none-any.whl (19.7 MB)
    Collecting conllu>=4.0
      Using cached conllu-4.5.2-py2.py3-none-any.whl (16 kB)
    Collecting ftfy
      Using cached ftfy-6.1.1-py3-none-any.whl (53 kB)
    Collecting more-itertools
      Using cached more_itertools-8.14.0-py3-none-any.whl (52 kB)
    Requirement already satisfied: matplotlib>=2.2.3 in ./anaconda3/lib/python3.9/site-packages (from flair) (3.5.1)
    Collecting wikipedia-api
      Using cached Wikipedia-API-0.5.4.tar.gz (18 kB)
    Collecting bpemb>=0.3.2
      Using cached bpemb-0.3.3-py3-none-any.whl (19 kB)
    Collecting segtok>=1.5.7
      Using cached segtok-1.5.11-py3-none-any.whl (24 kB)
    Requirement already satisfied: python-dateutil>=2.6.1 in ./anaconda3/lib/python3.9/site-packages (from flair) (2.8.2)
    Requirement already satisfied: tqdm>=4.26.0 in ./anaconda3/lib/python3.9/site-packages (from flair) (4.64.0)
    Collecting hyperopt>=0.2.7
      Using cached hyperopt-0.2.7-py2.py3-none-any.whl (1.6 MB)
    Collecting konoha<5.0.0,>=4.0.0
      Using cached konoha-4.6.5-py3-none-any.whl (20 kB)
    Collecting pptree
      Using cached pptree-3.1.tar.gz (3.0 kB)
    Collecting sqlitedict>=1.6.0
      Using cached sqlitedict-2.0.0.tar.gz (46 kB)
    Requirement already satisfied: lxml in ./anaconda3/lib/python3.9/site-packages (from flair) (4.8.0)
    Collecting mpld3==0.3
      Using cached mpld3-0.3.tar.gz (788 kB)
    Collecting transformers>=4.0.0
      Using cached transformers-4.21.3-py3-none-any.whl (4.7 MB)
    Requirement already satisfied: requests[socks] in ./anaconda3/lib/python3.9/site-packages (from gdown==4.4.0->flair) (2.27.1)
    Requirement already satisfied: beautifulsoup4 in ./anaconda3/lib/python3.9/site-packages (from gdown==4.4.0->flair) (4.11.1)
    Requirement already satisfied: filelock in ./anaconda3/lib/python3.9/site-packages (from gdown==4.4.0->flair) (3.6.0)
    Requirement already satisfied: six in ./anaconda3/lib/python3.9/site-packages (from gdown==4.4.0->flair) (1.16.0)
    Requirement already satisfied: numpy in ./anaconda3/lib/python3.9/site-packages (from bpemb>=0.3.2->flair) (1.21.5)
    Requirement already satisfied: wrapt<2,>=1.10 in ./anaconda3/lib/python3.9/site-packages (from deprecated>=1.2.4->flair) (1.12.1)
    Requirement already satisfied: smart-open>=1.8.1 in ./anaconda3/lib/python3.9/site-packages (from gensim>=3.4.0->flair) (5.1.0)
    Requirement already satisfied: scipy>=0.18.1 in ./anaconda3/lib/python3.9/site-packages (from gensim>=3.4.0->flair) (1.7.3)
    Collecting py4j
      Using cached py4j-0.10.9.7-py2.py3-none-any.whl (200 kB)
    Requirement already satisfied: future in ./anaconda3/lib/python3.9/site-packages (from hyperopt>=0.2.7->flair) (0.18.2)
    Requirement already satisfied: cloudpickle in ./anaconda3/lib/python3.9/site-packages (from hyperopt>=0.2.7->flair) (2.0.0)
    Requirement already satisfied: networkx>=2.2 in ./anaconda3/lib/python3.9/site-packages (from hyperopt>=0.2.7->flair) (2.7.1)
    Collecting overrides<4.0.0,>=3.0.0
      Using cached overrides-3.1.0.tar.gz (11 kB)
    Collecting importlib-metadata<4.0.0,>=3.7.0
      Using cached importlib_metadata-3.10.1-py3-none-any.whl (14 kB)
    Requirement already satisfied: zipp>=0.5 in ./anaconda3/lib/python3.9/site-packages (from importlib-metadata<4.0.0,>=3.7.0->konoha<5.0.0,>=4.0.0->flair) (3.7.0)
    Requirement already satisfied: kiwisolver>=1.0.1 in ./anaconda3/lib/python3.9/site-packages (from matplotlib>=2.2.3->flair) (1.3.2)
    Requirement already satisfied: cycler>=0.10 in ./anaconda3/lib/python3.9/site-packages (from matplotlib>=2.2.3->flair) (0.11.0)
    Requirement already satisfied: fonttools>=4.22.0 in ./anaconda3/lib/python3.9/site-packages (from matplotlib>=2.2.3->flair) (4.25.0)
    Requirement already satisfied: pyparsing>=2.2.1 in ./anaconda3/lib/python3.9/site-packages (from matplotlib>=2.2.3->flair) (3.0.4)
    Requirement already satisfied: packaging>=20.0 in ./anaconda3/lib/python3.9/site-packages (from matplotlib>=2.2.3->flair) (21.3)
    Requirement already satisfied: pillow>=6.2.0 in ./anaconda3/lib/python3.9/site-packages (from matplotlib>=2.2.3->flair) (9.0.1)
    Requirement already satisfied: charset-normalizer~=2.0.0 in ./anaconda3/lib/python3.9/site-packages (from requests[socks]->gdown==4.4.0->flair) (2.0.4)
    Requirement already satisfied: idna<4,>=2.5 in ./anaconda3/lib/python3.9/site-packages (from requests[socks]->gdown==4.4.0->flair) (3.3)
    Requirement already satisfied: urllib3<1.27,>=1.21.1 in ./anaconda3/lib/python3.9/site-packages (from requests[socks]->gdown==4.4.0->flair) (1.26.9)
    Requirement already satisfied: certifi>=2017.4.17 in ./anaconda3/lib/python3.9/site-packages (from requests[socks]->gdown==4.4.0->flair) (2021.10.8)
    Requirement already satisfied: joblib>=0.11 in ./anaconda3/lib/python3.9/site-packages (from scikit-learn>=0.21.3->flair) (1.1.0)
    Requirement already satisfied: threadpoolctl>=2.0.0 in ./anaconda3/lib/python3.9/site-packages (from scikit-learn>=0.21.3->flair) (2.2.0)
    Requirement already satisfied: typing-extensions in ./anaconda3/lib/python3.9/site-packages (from torch!=1.8,>=1.5.0->flair) (4.1.1)
    Requirement already satisfied: pyyaml>=5.1 in ./anaconda3/lib/python3.9/site-packages (from transformers>=4.0.0->flair) (6.0)
    Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
      Downloading tokenizers-0.12.1-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.6 MB)
         |████████████████████████████████| 6.6 MB 134.7 MB/s
    Requirement already satisfied: soupsieve>1.2 in ./anaconda3/lib/python3.9/site-packages (from beautifulsoup4->gdown==4.4.0->flair) (2.3.1)
    Requirement already satisfied: wcwidth>=0.2.5 in ./anaconda3/lib/python3.9/site-packages (from ftfy->flair) (0.2.5)
    Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in ./anaconda3/lib/python3.9/site-packages (from requests[socks]->gdown==4.4.0->flair) (1.7.1)
    Building wheels for collected packages: gdown, mpld3, overrides, sqlitedict, langdetect, pptree, wikipedia-api
      Building wheel for gdown (PEP 517) ... done
      ...
      Successfully built gdown mpld3 overrides sqlitedict langdetect pptree wikipedia-api
    Installing collected packages: tokenizers, sentencepiece, py4j, overrides, importlib-metadata, huggingface-hub, wikipedia-api, transformers, sqlitedict, segtok, pptree, mpld3, more-itertools, langdetect, konoha, janome, hyperopt, gdown, ftfy, deprecated, conllu, bpemb, flair
      Attempting uninstall: importlib-metadata
        Found existing installation: importlib-metadata 4.11.3
        Uninstalling importlib-metadata-4.11.3:
          Successfully uninstalled importlib-metadata-4.11.3
    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    sphinx 4.4.0 requires importlib-metadata>=4.4; python_version < "3.10", but you have importlib-metadata 3.10.1 which is incompatible.
    Successfully installed bpemb-0.3.3 conllu-4.5.2 deprecated-1.2.13 flair-0.11.3 ftfy-6.1.1 gdown-4.4.0 huggingface-hub-0.9.1 hyperopt-0.2.7 importlib-metadata-3.10.1 janome-0.4.2 konoha-4.6.5 langdetect-1.0.9 more-itertools-8.14.0 mpld3-0.3 overrides-3.1.0 pptree-3.1 py4j-0.10.9.7 segtok-1.5.11 sentencepiece-0.1.95 sqlitedict-2.0.0 tokenizers-0.12.1 transformers-4.21.3 wikipedia-api-0.5.4
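
The log above ends with a resolver warning: importlib-metadata was downgraded to 3.10.1, while sphinx 4.4.0 asks for >=4.4. A minimal sketch for checking which versions actually ended up in the environment, using only the standard library's importlib.metadata. The helper name installed_version is hypothetical (it is not part of flair or pip), and the printed values depend on your environment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the pip log above.
    for name in ("flair", "importlib-metadata", "sphinx"):
        print(name, installed_version(name))
```

If sphinx matters in the same environment, installing flair into a separate virtual environment is one way to avoid the conflict.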

    Once the NER model is loaded, it can recognize the location in a sentence right away.

    $ python3
    Python 3.9.12 (main, Apr  5 2022, 06:56:58)
    [GCC 7.5.0] :: Anaconda, Inc. on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from flair.data import Sentence
    >>> from flair.models import SequenceTagger
    >>>
    >>> # make a sentence
    >>> sentence = Sentence('I love Berlin .')
    >>>
    >>> # load the NER tagger
    >>> tagger = SequenceTagger.load('ner')
    /home/anaconda3/lib/python3.9/site-packages/huggingface_hub/file_download.py:621: FutureWarning: `cached_download` is the legacy way to download files from the HF hub, please consider upgrading to `hf_hub_download`
      warnings.warn(
    2022-09-12 19:43:32,193 loading file /home/.flair/models/ner-english/4f4cdab26f24cb98b732b389e6cebc646c36f54cfd6e0b7d3b90b25656e4262f.8baa8ae8795f4df80b28e7f7b61d788ecbb057d1dc85aacb316f1bd02837a4a4
    2022-09-12 19:43:33,372 SequenceTagger predicts: Dictionary with 20 tags: <unk>, O, S-ORG, S-MISC, B-PER, E-PER, S-LOC, B-ORG, E-ORG, I-PER, S-PER, B-MISC, I-MISC, E-MISC, I-ORG, B-LOC, E-LOC, I-LOC, <START>, <STOP>
    >>>
    >>> # run NER over sentence
    >>> tagger.predict(sentence)
    >>> print(sentence)
    Sentence: "I love Berlin ." → ["Berlin"/LOC]
    >>>
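
The session above only prints the tagged sentence. A sketch of how the recognized entities might be pulled out as plain Python values, assuming the same 'ner' model and flair's sentence.get_spans('ner') accessor. The helper names extract_entities and only_label are hypothetical, and the exact label API can differ between flair versions, so treat this as a starting point rather than the library's canonical usage.

```python
def extract_entities(text, model_name="ner"):
    """Return (text, label, score) tuples for each entity flair finds."""
    # Imported lazily because loading flair (and torch) is slow.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_name)  # downloads the model on first use
    sentence = Sentence(text)
    tagger.predict(sentence)

    entities = []
    for span in sentence.get_spans("ner"):
        label = span.get_label("ner")
        entities.append((span.text, label.value, label.score))
    return entities


def only_label(entities, label):
    """Keep the surface text of entities with the given label, e.g. 'LOC'."""
    return [text for text, tag, _score in entities if tag == label]


if __name__ == "__main__":
    found = extract_entities("I love Berlin .")
    print(only_label(found, "LOC"))
```

Separating the filtering step from the model call keeps the slow part (loading the tagger) in one place and makes the filter easy to reuse for PER, ORG, or MISC.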

     

    I'll try embeddings next.
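
As a preview, a minimal sketch of what that might look like, based on the classic word embeddings document linked in the references. It assumes the 'glove' embedding name from that document; the helper names embed_tokens and summarize are hypothetical, and I have not run this in the session above.

```python
def embed_tokens(text, embedding_name="glove"):
    """Return [(token_text, vector_as_list), ...] for each token in text."""
    # Imported lazily because loading flair (and torch) is slow.
    from flair.data import Sentence
    from flair.embeddings import WordEmbeddings

    embedding = WordEmbeddings(embedding_name)  # downloads vectors on first use
    sentence = Sentence(text)
    embedding.embed(sentence)  # attaches a tensor to every token
    return [(token.text, token.embedding.tolist()) for token in sentence]


def summarize(vectors, preview=3):
    """Format one line per token: text, dimensionality, first few values."""
    return [f"{text}: dim={len(vec)} head={vec[:preview]}" for text, vec in vectors]


if __name__ == "__main__":
    for line in summarize(embed_tokens("I love Berlin .")):
        print(line)
```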

     

     

     

    References:

    https://github.com/flairNLP/flair

     


    https://github.com/flairNLP/flair/blob/master/resources/docs/embeddings/CLASSIC_WORD_EMBEDDINGS.md

     


     
