Hitachi Vantara Pentaho Community Forums

Thread: Creating/Converting .arff files

  1. #1
    Join Date
    Jul 2016
    Posts
    1

    Creating/Converting .arff files

    Hello everyone,

    I'm new to Weka and I'm trying to convert some .txt files to the .arff format.
    For example, the Brown corpus: http://www.sls.hawaii.edu/bley-vroman/brown_corpus.html

    I've tried using this tool for the task: http://weka.wikispaces.com/Text+cate...tion+with+WEKA

    But when I run the provided command "java weka.core.converters.TextDirectoryLoader -dir text_example > text_example.arff", I get
    the following error: Main class weka.core.converters.TextDirectoryLoader can't be found or loaded.

    (I'm using Windows 7 x64 and the Weka 3.9.0 developer version. I opened the command prompt, navigated to the folder that contains my subfolder with the .txt file, and entered this command.)



    I also tried the same with the Weka Explorer: Open file -> choose the folder containing the subfolder (with the .txt file) -> after the error message, choose TextDirectoryLoader -> click OK with the provided parameters.

    Unfortunately, this way I only get a 1 KB .arff file...

    Is there a better way to convert a .txt file to a .arff file? Am I doing something wrong with the TextDirectoryLoader?

    I hope someone can help me, because it would take me weeks to create the .arff files on my own.
    Thanks in advance!

  2. #2
    Join Date
    Aug 2006
    Posts
    1,741

    The TextDirectoryLoader expects each document to be in its own file, and each file to sit in a directory named after the class of that collection of documents. It looks like your Brown corpus is one giant file, so you'll need to split it up into individual documents first.

    Cheers,
    Mark.
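
The splitting step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested recipe for the Brown corpus file itself: it assumes documents in the big .txt file are separated by blank lines (the real file may need a different separator pattern), and the function name `split_corpus`, the class name `brown`, and the `doc_0001.txt` naming scheme are all made up for this example.

```python
import re
from pathlib import Path


def split_corpus(text, out_dir, class_name, sep_pattern=r"\n\s*\n"):
    """Write each chunk of `text` to its own file under out_dir/class_name/.

    Hypothetical helper: the layout (one subdirectory per class, one file
    per document) follows the description in the reply above.
    """
    class_dir = Path(out_dir) / class_name
    class_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: documents are separated by one or more blank lines.
    # Adjust sep_pattern if the corpus uses a different delimiter.
    chunks = [c.strip() for c in re.split(sep_pattern, text) if c.strip()]

    paths = []
    for i, chunk in enumerate(chunks, start=1):
        path = class_dir / f"doc_{i:04d}.txt"
        path.write_text(chunk + "\n", encoding="utf-8")
        paths.append(path)
    return paths


# Example call (the input file name is a placeholder):
# text = Path("brown_corpus.txt").read_text(encoding="utf-8")
# split_corpus(text, "text_example", "brown")
```

After this, `text_example` would contain a `brown` subdirectory with one file per document, which is the layout `-dir text_example` is meant to point at. Separately, a "main class can't be found or loaded" error usually means weka.jar is not on the Java classpath; adding `-cp` with the path to your weka.jar to the `java` command is the usual fix (the exact path depends on your installation).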
