ESTARD Data Miner FAQ


Here you can find answers to the following questions:
  1. What is a "Class"?
  2. What field should be used as the "Class field"?
  3. How do I create numeric classes?
  4. Why do I get too many classes after selecting the field to analyze?
  5. Why are all classes analyzed when I select only one class for the Statistics Query?
  6. Why does EDM detect empty classes or values in statistics?
  7. How many classes are recommended for creating rules and decision trees?
  8. What is the difference between the Learning and Analyzed databases?
  9. Why is an ID field necessary for analyzing two or more databases together?
  10. How are values in statistics divided into groups?
  11. What is the difference between Rules and Decision Trees?
  12. How do I use the settings for creating Rules and Decision Trees?
  13. Why are no Rules created after I press the "Create Rules" button?
  14. What algorithm is used for rule creation?

If you have further questions, please don't hesitate to contact us.


  1. What is a Class?

    A class is a unique value in the analyzed column. For example, if a column of logical type contains the values "True" and "False", it contains two classes.
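
    As a rough illustration of this definition (not EDM's own code), the sketch below counts the distinct values in a column. The sample values and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical sample column of logical type, stored as text.
column = ["True", "False", "True", "True", "False"]

# Each distinct value is one class; Counter also records how often it occurs.
class_counts = Counter(column)
classes = sorted(class_counts)

print(classes)            # the distinct values found in the column
print(len(classes))       # the number of classes
```

    A column with only the values "True" and "False" yields two classes, however many records it holds.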
  2. What field should be used as the "Class field"?

    The best way to get valuable results from data mining is to analyze the field that contains the key information in a record. It can be of any type, but it must not be the ID field. Text fields with a high proportion of unique values will probably not return good results either. For example, the field "Customer name", containing the names of a company's customers, will probably contain many unique values. If you set this field as the Class field, you will receive Rules that are equal to records, so each rule merely describes a single customer. This doesn't apply to numeric fields, because there Classes are created manually and you can create any number of classes.
  3. How do I create numeric classes?

    Creating a numeric class means creating the intervals you want to analyze. To decide which intervals to use, click the "View Field Values" button to see how many values fall into equal intervals. Use the "View Table" button to view the values in the table selected for analysis. You can also see the minimum and maximum values of the Class field above the input fields. The intervals you create must not intersect. For example, the intervals "1..10" and "11..20" do not intersect, while "1..10" and "6..20" do. If you create intersecting intervals, you will be asked to change the entered value.
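
    The non-intersection requirement can be checked before entering intervals. The following is a minimal sketch, assuming inclusive integer bounds written as (low, high) pairs; the function name is hypothetical and is not part of EDM.

```python
def find_intersections(intervals):
    """Return pairs of inclusive (low, high) intervals that overlap."""
    ordered = sorted(intervals)
    clashes = []
    for i, (low, high) in enumerate(ordered):
        for other_low, other_high in ordered[i + 1:]:
            # After sorting, other_low >= low, so the two intervals
            # overlap exactly when the later one starts before this one ends.
            if other_low <= high:
                clashes.append(((low, high), (other_low, other_high)))
    return clashes


ok = find_intersections([(1, 10), (11, 20)])
bad = find_intersections([(1, 10), (6, 20)])
print(ok)
print(bad)
```

    The first call mirrors the "1..10" and "11..20" example and reports no clashes; the second mirrors "1..10" and "6..20" and reports the overlapping pair.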
  4. Why do I get too many classes after selecting the field to analyze?

    If you have too many classes, you have probably used a text field as the Class field; for numeric fields you are asked to create the classes yourself, and you can create as many as you wish. If a text field contains too many classes, you can split them into groups and analyze them step by step. If the column contains numeric data but is detected as a text column, it contains some incorrect records that hold text instead of numbers.
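
    One way to locate such incorrect records before loading the data is to try converting every value to a number. This is a sketch under the assumption that the column has been exported as a list of strings; the helper name and sample data are hypothetical.

```python
def non_numeric_values(values):
    """Return (index, value) pairs for entries that cannot be parsed as numbers."""
    bad = []
    for index, value in enumerate(values):
        try:
            float(value)
        except ValueError:
            bad.append((index, value))
    return bad


# Hypothetical column that should be numeric but contains stray text.
column = ["12.5", "7", "n/a", "19", "unknown"]
offenders = non_numeric_values(column)
print(offenders)
```

    Correcting or removing the reported entries should leave a column that contains only numbers.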
  5. Why are all classes analyzed when I select only one class for the Statistics Query?

    All classes have to be analyzed during the Statistics query, because EDM has to detect differences between them. After performing the statistics query you can select values you want to analyze further.
  6. Why does EDM detect empty classes or values in statistics?

    For better results, it is good to use "clean" databases. Empty records are automatically detected in the database, and although they are not filled in, they are analyzed as well. To avoid this problem, deselect the empty values, if any are detected in the analyzed column, before creating rules and trees.
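
    If you prefer to clean the data before loading it, empty values can also be filtered out beforehand. A minimal sketch, assuming records are held as dictionaries and that "empty" means missing or whitespace-only; the field names are hypothetical.

```python
def is_empty(value):
    """Treat None and blank or whitespace-only text as empty."""
    return value is None or str(value).strip() == ""


# Hypothetical records; "segment" plays the role of the analyzed column.
records = [
    {"id": 1, "segment": "retail"},
    {"id": 2, "segment": ""},
    {"id": 3, "segment": None},
    {"id": 4, "segment": "wholesale"},
]

clean = [r for r in records if not is_empty(r["segment"])]
print([r["id"] for r in clean])
```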
  7. How many classes are recommended for creating rules and decision trees?

    The more classes are detected in the analyzed column, the more precise the results obtained during the statistics query. But the number of classes shouldn't be too large: for example, creating rules for several hundred classes takes much longer than for 20 classes. So the number of classes should be reasonable in terms of both statistics and performance.
  8. What is the difference between the Learning and Analyzed databases?

    The Learning database is used for creating statistics, if-then rules and decision trees. The Analyzed database is not used for these operations. It is used for selecting records from the dataset with the use of rules or trees. This additional feature allows you to easily apply the results of data mining to some other database.
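
    The idea of applying a rule obtained from one dataset to another can be sketched as follows. The rule format, field names and records here are hypothetical and do not reflect how EDM stores rules internally.

```python
# A hypothetical if-then rule obtained from the Learning database:
# IF region = "north" AND plan = "premium" THEN churn = "no".
rule = {
    "conditions": {"region": "north", "plan": "premium"},
    "class": ("churn", "no"),
}


def matches(record, conditions):
    """True when the record satisfies every condition of the rule."""
    return all(record.get(field) == value for field, value in conditions.items())


# Hypothetical Analyzed database: records that were not used to build the rule.
analyzed = [
    {"id": 101, "region": "north", "plan": "premium"},
    {"id": 102, "region": "south", "plan": "premium"},
    {"id": 103, "region": "north", "plan": "basic"},
]

selected = [r for r in analyzed if matches(r, rule["conditions"])]
print([r["id"] for r in selected])
```

    Only the record that meets both conditions is selected; the rule's class is what the rule would predict for it.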
  9. Why is an ID field necessary for analyzing two or more databases together?

    The ID field is not necessary if you want to analyze only one table. But it is vital for determining which record in one table corresponds to which record in another table. Without an ID field, such a relation cannot be established.
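
    The role of the ID field can be illustrated with a simple join of two tables. This is a sketch with hypothetical tables and field names, assuming each ID occurs at most once per table.

```python
# Two hypothetical tables that describe the same customers.
customers = [
    {"id": 1, "region": "north"},
    {"id": 2, "region": "south"},
    {"id": 3, "region": "east"},
]
orders = [
    {"id": 1, "total": 120},
    {"id": 3, "total": 45},
]

# Index one table by ID so matching records can be looked up directly.
orders_by_id = {row["id"]: row for row in orders}

# Combine the records that share an ID; records without a partner are skipped.
joined = [
    {**customer, **orders_by_id[customer["id"]]}
    for customer in customers
    if customer["id"] in orders_by_id
]
print(joined)
```

    Without the shared "id" key there would be no way to tell which order row belongs to which customer row.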
  10. How are values in statistics divided into groups?

    The values are grouped automatically in the way that best reveals the differences between classes.
  11. What is the difference between Rules and Decision Trees?

    These two data mining methods allow you to look at the same problem from different points of view. Rules can be represented in the form of a tree, and a decision tree in the form of rules, so the two methods differ not in how they are represented but in how they are obtained. Used together, they create a good combination of models for understanding relations in data.
  12. How do I use the settings for creating Rules and Decision Trees?

    It is better to start with high values for the "Probability" and "Rule cases" settings. If few rules are obtained, or none at all, lower these values and create the rules again.
  13. Why are no Rules created after I press the "Create Rules" button?

    This might happen if the "Probability" or "Rule cases" values (set in the "Query options" dialog) are too high for the dataset you are working on. For example, the number of records for some class might be lower than "Rule cases", or the analyzed data might contain too many unique values, so the probability of the rules might be lower than the one in the settings.
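
    The effect of the two thresholds can be sketched as follows. This is an illustration only: it assumes that "Rule cases" is the minimum number of records that satisfy a rule and belong to its class, and that "Probability" is the minimum share of the records satisfying the rule's conditions that belong to that class. EDM's exact definitions may differ, and all names and data below are hypothetical.

```python
def evaluate_rule(records, conditions, class_field, class_value):
    """Return (cases, probability) for one candidate if-then rule."""
    covered = [
        r for r in records
        if all(r.get(field) == value for field, value in conditions.items())
    ]
    cases = sum(1 for r in covered if r[class_field] == class_value)
    probability = cases / len(covered) if covered else 0.0
    return cases, probability


def passes(records, conditions, class_field, class_value, min_probability, min_cases):
    """True when the rule meets both assumed thresholds."""
    cases, probability = evaluate_rule(records, conditions, class_field, class_value)
    return cases >= min_cases and probability >= min_probability


# Hypothetical dataset: the rule "IF plan = basic THEN churn = yes".
records = [
    {"plan": "basic", "churn": "yes"},
    {"plan": "basic", "churn": "yes"},
    {"plan": "basic", "churn": "no"},
    {"plan": "premium", "churn": "no"},
    {"plan": "premium", "churn": "no"},
]
rule = ({"plan": "basic"}, "churn", "yes")

strict = passes(records, *rule, min_probability=0.9, min_cases=3)
relaxed = passes(records, *rule, min_probability=0.6, min_cases=2)
print(strict, relaxed)
```

    Under these assumptions the rule is supported by 2 records out of 3 covered, so it is rejected by the strict settings and accepted once both values are lowered.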
  14. What algorithm is used for rule creation?

    The algorithm that ESTARD Data Miner uses to create rules was developed specially by our analysts. It allows you to obtain ALL if-then rules of ANY length. So the length of the rules depends only on the number of fields you select for rule creation.
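
    The link between the number of selected fields and the possible rule length can be shown with a brute-force enumeration. This is not the ESTARD algorithm, which is not described here; it is only a sketch of how many field combinations exist for a hypothetical set of three fields.

```python
from itertools import combinations

# Hypothetical fields selected for rule creation.
fields = ["region", "plan", "age_group"]

# Every non-empty combination of fields could form the "if" part of a rule,
# so the longest possible rule uses all selected fields.
field_sets = [
    combo
    for length in range(1, len(fields) + 1)
    for combo in combinations(fields, length)
]

print(len(field_sets))                      # number of field combinations
print(max(len(combo) for combo in field_sets))  # longest possible rule
```

    With three fields there are seven combinations and the longest rule has three conditions; each additional field roughly doubles the number of combinations.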


(c) 2004-2011 ESTARD.
ALL RIGHTS RESERVED.