Public H2O 3 / PUBDEV-5799

Add direct support for text data in H2O AutoML using Word2Vec

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 3.26.0.MAYBE
    • Component/s: AutoML
    • Labels: None

      Description

      AutoML should accept a 2-column H2OFrame as training data, where one column is text and the other is a label. From there, we can train a Word2Vec model, transform the text into vectors, and then proceed with the typical AutoML process.

      I have to think about how we would store the W2V model in the AutoML object for future use, but there's probably a reasonable way to make this work.
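
One possible shape for this, as a rough sketch only and not a proposed implementation: a thin wrapper that keeps the fitted Word2Vec model next to the AutoML run, so new text can be vectorized the same way at prediction time. The class name `TextAutoML`, its parameters, and the regex tokenizer are hypothetical. The sketch also assumes a classification label (it calls `asfactor()`) and averages word vectors per document.

```python
class TextAutoML:
    """Hypothetical wrapper: Word2Vec featurization followed by H2O AutoML."""

    def __init__(self, text_col, label_col, vec_size=100, max_models=10, seed=1):
        self.text_col = text_col
        self.label_col = label_col
        self.vec_size = vec_size
        self.max_models = max_models
        self.seed = seed
        self.w2v = None  # fitted Word2Vec model, kept for reuse at predict time
        self.aml = None  # fitted H2OAutoML object

    def _tokenize(self, frame):
        # One word per row; H2O separates documents with NA rows.
        # Splitting on non-word characters is an assumption, not a requirement.
        return frame[self.text_col].ascharacter().tokenize("\\W+").tolower()

    def _vectorize(self, frame):
        # Average the word vectors of each document into one feature row.
        return self.w2v.transform(self._tokenize(frame), aggregate_method="AVERAGE")

    def train(self, frame):
        # Imports are deferred so the class can be defined without h2o loaded.
        from h2o.automl import H2OAutoML
        from h2o.estimators.word2vec import H2OWord2vecEstimator

        self.w2v = H2OWord2vecEstimator(vec_size=self.vec_size)
        self.w2v.train(training_frame=self._tokenize(frame))

        # Assumption: the label is categorical (classification).
        train = self._vectorize(frame).cbind(frame[self.label_col].asfactor())

        self.aml = H2OAutoML(max_models=self.max_models, seed=self.seed)
        self.aml.train(y=self.label_col, training_frame=train)
        return self

    def predict(self, frame):
        # Reuse the stored Word2Vec model so new text maps to the same vector space.
        return self.aml.leader.predict(self._vectorize(frame))


# Usage (requires a running H2O cluster; file and column names are placeholders):
#   import h2o
#   h2o.init()
#   df = h2o.import_file("reviews.csv")
#   model = TextAutoML("text", "label").train(df)
#   preds = model.predict(df)
```

Keeping both models on one object is only one option. It leaves open how the pair would be saved and reloaded together, which is the question raised above.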

            People

            • Assignee: Michal Kurka (michalk)
            • Reporter: Erin LeDell (erin)
            • Votes: 0
            • Watchers: 3
