5 votes

Creating a numpy.array with a variable number of fields to test arcpy.da.ExtendTable performance?

There was an ArcPy Café blog post entitled Adding Fields: Performance Tips which advocates:

Two approaches to help increase the performance of adding numerous fields to a table or feature class.

...

2. Use the data access and NumPy modules. The data access module's function named ExtendTable() joins the contents of a NumPy structured array to a table based on a common attribute field.
This is the fastest approach; however, the types of fields that can be added using numpy are limited. There is no support for adding blob, raster, or date fields. Also, the field alias cannot be defined or altered.
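
As a rough illustration of the quoted approach, the sketch below builds a small NumPy structured array whose first field is the join key. The field names (`_ID`, `TEST_INTEGER`, `TEST_TEXT`) and the row values are made up for illustration, and the `arcpy.da.ExtendTable` call is left as a comment because it needs an ArcGIS installation and a real table path.

```python
import numpy

# Hypothetical field layout: a join key plus two new attribute fields.
dtype = numpy.dtype([('_ID', numpy.int32),
                     ('TEST_INTEGER', numpy.int32),
                     ('TEST_TEXT', '|S100')])

# One tuple per row; the _ID values are meant to match the table's object IDs.
rows = [(1, 10, b'first'), (2, 20, b'second'), (3, 30, b'third')]
narray = numpy.array(rows, dtype=dtype)

print(narray.dtype.names)
print(narray['TEST_INTEGER'])

# With arcpy available, the array would be joined on the object ID field:
# arcpy.da.ExtendTable(r"<path to table>", "OID@", narray, "_ID")
```

An empty array with the same dtype, as used in the test code further down, adds the fields without supplying any values.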

I have done some performance testing, using ArcGIS 10.2 for Desktop, of this method compared with multiple (three and nine) Add Field calls on file geodatabase feature classes of 16 to 1,000,000 polygons. My initial findings suggest that the gains for nine and especially for three fields are lost as the number of polygons increases. Consequently, I suspect that this method should only be advocated when there are many fields and not so many features.

Below is my current test code and its current output, but these are presented purely for interest/background.

What I would like to know is how to make the number of fields placed in the NumPy array configurable.

I suspect there is a Python technique unknown to me, but my research has not yet turned it up, so I will try to explain ...

If I have a variable numFields = 10, can I use something like for i in range(numFields) to expand 'TEST_INTEGER'+str(i) so that it becomes ...

narray = numpy.array([],
                     numpy.dtype([('_ID', numpy.int),
                                  ('TEST_INTEGER'+str(0), numpy.int),
                                  ('TEST_INTEGER'+str(1), numpy.int),
                                  ('TEST_INTEGER'+str(2), numpy.int),
                                  ('TEST_INTEGER'+str(3), numpy.int),
                                  ('TEST_INTEGER'+str(4), numpy.int),
                                  ('TEST_INTEGER'+str(5), numpy.int),
                                  ('TEST_INTEGER'+str(6), numpy.int),
                                  ('TEST_INTEGER'+str(7), numpy.int),
                                  ('TEST_INTEGER'+str(8), numpy.int),
                                  ('TEST_INTEGER'+str(9), numpy.int),
                                  ]))

import arcpy,numpy,time

fc = r"C:\polygeo\Projects\test.gdb\testFishnet"
fc2 = r"C:\polygeo\Projects\test.gdb\testFishnet2"
cellWidthHeightList = ["0.25","0.1","0.025","0.01","0.0025","0.001"]

for cellWidthHeight in cellWidthHeightList:
    numCells = (1 / float(cellWidthHeight)) ** 2
    print "Creating fishnet of {0} polygons".format(str(int(numCells)))
    if not arcpy.Exists(r"C:\polygeo\Projects\test.gdb"):
        arcpy.CreateFileGDB_management(r"C:\polygeo\Projects","test.gdb")
    if arcpy.Exists(r"C:\polygeo\Projects\test.gdb\testFishnet"):
        arcpy.Delete_management(r"C:\polygeo\Projects\test.gdb\testFishnet")
    # Create the fishnet whether or not an old copy had to be deleted first
    arcpy.CreateFishnet_management("C:/polygeo/Projects/test.gdb/testFishnet",
                                   "0 0","0 1",cellWidthHeight,cellWidthHeight,
                                   "#","#","1 1","NO_LABELS","#","POLYGON")
    if arcpy.Exists(r"C:\polygeo\Projects\test.gdb\testFishnet2"):
        arcpy.Delete_management(r"C:\polygeo\Projects\test.gdb\testFishnet2")
    # Copy to a second feature class so both methods start from identical tables
    arcpy.Copy_management(r"C:\polygeo\Projects\test.gdb\testFishnet",
                          r"C:\polygeo\Projects\test.gdb\testFishnet2",
                          "FeatureClass")

    start = time.clock()
    arcpy.AddField_management(fc,"TEST_INTEGER","LONG","#","#","#","#",
                              "NULLABLE","NON_REQUIRED","#")
    arcpy.AddField_management(fc,"TEST_TEXT","TEXT","#","#","100","#",
                              "NULLABLE","NON_REQUIRED","#")
    arcpy.AddField_management(fc,"TEST_FLOAT","DOUBLE","#","#","#","#",
                              "NULLABLE","NON_REQUIRED","#")
    arcpy.AddField_management(fc,"TEST_INTEGER2","LONG","#","#","#","#",
                              "NULLABLE","NON_REQUIRED","#")
    arcpy.AddField_management(fc,"TEST_TEXT2","TEXT","#","#","100","#",
                              "NULLABLE","NON_REQUIRED","#")
    arcpy.AddField_management(fc,"TEST_FLOAT2","DOUBLE","#","#","#","#",
                              "NULLABLE","NON_REQUIRED","#")
    arcpy.AddField_management(fc,"TEST_INTEGER3","LONG","#","#","#","#",
                              "NULLABLE","NON_REQUIRED","#")
    arcpy.AddField_management(fc,"TEST_TEXT3","TEXT","#","#","100","#",
                              "NULLABLE","NON_REQUIRED","#")
    arcpy.AddField_management(fc,"TEST_FLOAT3","DOUBLE","#","#","#","#",
                              "NULLABLE","NON_REQUIRED","#")
    elapsed = (time.clock() - start)
    print " - AddField took {0} seconds to add NINE fields".format(elapsed)

    start = time.clock()
    narray = numpy.array([],
                         numpy.dtype([('_ID', numpy.int),
                                      ('TEST_INTEGER', numpy.int),
                                      ('TEST_TEXT', '|S100'),
                                      ('TEST_FLOAT', numpy.float),
                                      ('TEST_INTEGER2', numpy.int),
                                      ('TEST_TEXT2', '|S100'),
                                      ('TEST_FLOAT2', numpy.float),
                                      ('TEST_INTEGER3', numpy.int),
                                      ('TEST_TEXT3', '|S100'),
                                      ('TEST_FLOAT3', numpy.float),
                                      ]))
    print " - numpy.array() took {0} seconds".format((time.clock() - start))
    arcpy.da.ExtendTable(fc2, "OID@", narray, "_ID")
    print " - arcpy.da.ExtendTable() took {0} seconds".format((time.clock() - start))
    elapsed2 = (time.clock() - start)
    print " - NumPy and ExtendTable took {0} seconds to add NINE fields".format(elapsed2)
    print " - NumPyExtendTable:AddField ratio = {0}".format(str(float(elapsed2) / float(elapsed)))

>>> ================================ RESTART ================================
>>> 
Creating fishnet of 16 polygons
 - AddField took 6.15530941159 seconds to add NINE fields
 - numpy.array() took 8.66656530336e-05 seconds
 - arcpy.da.ExtendTable() took 0.251475596777 seconds
 - NumPy and ExtendTable took 0.25444199483 seconds to add NINE fields
 - NumPyExtendTable:AddField ratio = 0.0413369950748
Creating fishnet of 100 polygons
 - AddField took 7.01486277938 seconds to add NINE fields
 - numpy.array() took 6.49992397754e-05 seconds
 - arcpy.da.ExtendTable() took 0.523825072221 seconds
 - NumPy and ExtendTable took 0.554671962901 seconds to add NINE fields
 - NumPyExtendTable:AddField ratio = 0.0790709640866
Creating fishnet of 1600 polygons
 - AddField took 6.93555420404 seconds to add NINE fields
 - numpy.array() took 2.96487760387e-05 seconds
 - arcpy.da.ExtendTable() took 0.297160559526 seconds
 - NumPy and ExtendTable took 0.334817165881 seconds to add NINE fields
 - NumPyExtendTable:AddField ratio = 0.0482754738887
Creating fishnet of 10000 polygons
 - AddField took 6.60505587654 seconds to add NINE fields
 - numpy.array() took 2.77482134763e-05 seconds
 - arcpy.da.ExtendTable() took 0.612328569257 seconds
 - NumPy and ExtendTable took 0.672550554964 seconds to add NINE fields
 - NumPyExtendTable:AddField ratio = 0.101823598094
Creating fishnet of 160000 polygons
 - AddField took 9.54403527444 seconds to add NINE fields
 - numpy.array() took 3.57305762435e-05 seconds
 - arcpy.da.ExtendTable() took 5.36002166641 seconds
 - NumPy and ExtendTable took 5.40552797628 seconds to add NINE fields
 - NumPyExtendTable:AddField ratio = 0.566377619198
Creating fishnet of 1000000 polygons
 - AddField took 28.7716610917 seconds to add NINE fields
 - numpy.array() took 3.61106887681e-05 seconds
 - arcpy.da.ExtendTable() took 34.4609012468 seconds
 - NumPy and ExtendTable took 34.4646480158 seconds to add NINE fields
 - NumPyExtendTable:AddField ratio = 1.19786785706
>>> 
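
The test code above targets the Python 2 interpreter shipped with ArcGIS 10.2. On Python 3 (for example under ArcGIS Pro), `time.clock()` is no longer available; it was removed in Python 3.8. A minimal sketch of the same timing pattern with `time.perf_counter()` follows. The `timed` helper and the placeholder workload are hypothetical and only stand in for the geoprocessing calls.

```python
import time

def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start

# Placeholder workload standing in for the AddField / ExtendTable calls.
total, elapsed = timed(sum, range(1000))
print(" - placeholder took {0} seconds".format(elapsed))
```

Timing each call separately also avoids reporting a cumulative figure under a single step's label.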

5 votes

David Holm (6165 points)

You can use a list comprehension:

narray = numpy.array([], numpy.dtype([('_ID', numpy.int)] +
                                     [('TEST_INTEGER'+str(x), numpy.int) for x in range(numFields)]))
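
The same idea can be wrapped in a small helper so that the field count is a parameter. This is a sketch rather than the answerer's code: `build_dtype` and its `prefix` argument are hypothetical names, and `numpy.int32` is used instead of `numpy.int` because that alias has been removed from recent NumPy releases.

```python
import numpy

def build_dtype(num_fields, prefix='TEST_INTEGER'):
    """Return a structured dtype with an _ID key and num_fields integer fields."""
    fields = [('_ID', numpy.int32)]
    # One (name, type) tuple per generated field: prefix0, prefix1, ...
    fields += [(prefix + str(i), numpy.int32) for i in range(num_fields)]
    return numpy.dtype(fields)

num_fields = 10
# Empty array: it carries only the field definitions, no rows.
narray = numpy.array([], dtype=build_dtype(num_fields))
print(narray.dtype.names)
```

Because the dtype is just a list of tuples, lists for other field types can be concatenated in the same way before calling `numpy.dtype`.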

0 votes

UnkwnTech (21942 points)

My final code, based on the answer by @EvilGenius, and the results it produced are shown below for interest. You can see that the benefit of using the data access and NumPy modules, over multiple uses of Add Field, to add extra fields is greatest when there are lots of fields and fewer features.

If you have only a few fields to add, and lots of features, then using a few Add Field calls is quicker.

I was expecting more features to always widen the gap in favour of the data access and NumPy modules method, so I was a little surprised to see the opposite gap open up when only a few fields are added and many features are involved.

import arcpy,numpy,time

fc = r"C:\polygeo\Projects\test.gdb\testFishnet"
fc2 = r"C:\polygeo\Projects\test.gdb\testFishnet2"
cellWidthHeightList = ["0.1","0.01","0.001"]
numFloatFieldsList = [8,28,98]
for numFloatFields in numFloatFieldsList:
    for cellWidthHeight in cellWidthHeightList:
        numCells = (1 / float(cellWidthHeight)) ** 2

        if not arcpy.Exists(r"C:\polygeo\Projects\test.gdb"):
            arcpy.CreateFileGDB_management(r"C:\polygeo\Projects","test.gdb")
        if arcpy.Exists(r"C:\polygeo\Projects\test.gdb\testFishnet"):
            arcpy.Delete_management(r"C:\polygeo\Projects\test.gdb\testFishnet")
        print "Creating fishnet of {0} polygons".format(str(int(numCells)))
        arcpy.CreateFishnet_management(r"C:\polygeo\Projects\test.gdb\testFishnet",
                                       "0 0","0 1",cellWidthHeight,cellWidthHeight,
                                       "#","#","1 1","NO_LABELS","#","POLYGON")
        if arcpy.Exists(r"C:\polygeo\Projects\test.gdb\testFishnet2"):
            arcpy.Delete_management(r"C:\polygeo\Projects\test.gdb\testFishnet2")
        arcpy.Copy_management(r"C:\polygeo\Projects\test.gdb\testFishnet",
                              r"C:\polygeo\Projects\test.gdb\testFishnet2",
                              "FeatureClass")

        start = time.clock()
        for x in range(numFloatFields):
            arcpy.AddField_management(fc,"TEST_FLOAT"+str(x),"DOUBLE","#","#","#","#",
                                      "NULLABLE","NON_REQUIRED","#")
        arcpy.AddField_management(fc,"TEST_INTEGER","LONG","#","#","#","#",
                                  "NULLABLE","NON_REQUIRED","#")
        arcpy.AddField_management(fc,"TEST_TEXT","TEXT","#","#","100","#",
                                  "NULLABLE","NON_REQUIRED","#")
        elapsed = (time.clock() - start)
        print " - AddField took {0} seconds to add {1} fields".format(elapsed,str(2 + numFloatFields))

        start = time.clock()
        narray = numpy.array([],
                             numpy.dtype([('_ID', numpy.int)] +
                                         [('TEST_FLOAT'+str(x), numpy.float) for x in range(numFloatFields)] +
                                         [('TEST_INTEGER', numpy.int),('TEST_TEXT', '|S100')]))
        arcpy.da.ExtendTable(fc2, "OID@", narray, "_ID")
        elapsed2 = (time.clock() - start)
        print " - NumPy and ExtendTable took {0} seconds to add {1} fields".format(elapsed2,str(2 + numFloatFields))
        print " - NumPyExtendTable:AddField ratio = {0}".format(str(float(elapsed2) / float(elapsed)))

>>> ================================ RESTART ================================
>>> 
Creating fishnet of 100 polygons
 - AddField took 7.70771034263 seconds to add 10 fields
 - NumPy and ExtendTable took 0.248153882235 seconds to add 10 fields
 - NumPyExtendTable:AddField ratio = 0.0321955381306
Creating fishnet of 10000 polygons
 - AddField took 6.89551317455 seconds to add 10 fields
 - NumPy and ExtendTable took 0.665253910326 seconds to add 10 fields
 - NumPyExtendTable:AddField ratio = 0.0964763453401
Creating fishnet of 1000000 polygons
 - AddField took 30.2332401168 seconds to add 10 fields
 - NumPy and ExtendTable took 33.6147804976 seconds to add 10 fields
 - NumPyExtendTable:AddField ratio = 1.11184842801

Creating fishnet of 100 polygons
 - AddField took 20.4502870049 seconds to add 30 fields
 - NumPy and ExtendTable took 0.726164488171 seconds to add 30 fields
 - NumPyExtendTable:AddField ratio = 0.0355087675785
Creating fishnet of 10000 polygons
 - AddField took 23.9535915244 seconds to add 30 fields
 - NumPy and ExtendTable took 1.59976486159 seconds to add 30 fields
 - NumPyExtendTable:AddField ratio = 0.0667860124427
Creating fishnet of 1000000 polygons
 - AddField took 129.248610019 seconds to add 30 fields
 - NumPy and ExtendTable took 111.991675127 seconds to add 30 fields
 - NumPyExtendTable:AddField ratio = 0.866482626862

Creating fishnet of 100 polygons
 - AddField took 69.8152053888 seconds to add 100 fields
 - NumPy and ExtendTable took 2.02338346148 seconds to add 100 fields
 - NumPyExtendTable:AddField ratio = 0.0289819882389
Creating fishnet of 10000 polygons
 - AddField took 72.8902866096 seconds to add 100 fields
 - NumPy and ExtendTable took 4.54466470351 seconds to add 100 fields
 - NumPyExtendTable:AddField ratio = 0.0623493872077
Creating fishnet of 1000000 polygons
 - AddField took 418.9092904 seconds to add 100 fields
 - NumPy and ExtendTable took 330.8793667 seconds to add 100 fields
 - NumPyExtendTable:AddField ratio = 0.789859223185
>>> 

And the results of an additional test:

>>> ================================ RESTART ================================
>>> 
Creating fishnet of 100 polygons
 - AddField took 1.828699526 seconds to add 3 fields
 - NumPy and ExtendTable took 0.30685530312 seconds to add 3 fields
 - NumPyExtendTable:AddField ratio = 0.167799738971
Creating fishnet of 10000 polygons
 - AddField took 3.06924217256 seconds to add 3 fields
 - NumPy and ExtendTable took 0.202566890045 seconds to add 3 fields
 - NumPyExtendTable:AddField ratio = 0.0659989921473
Creating fishnet of 1000000 polygons
 - AddField took 2.04487743319 seconds to add 3 fields
 - NumPy and ExtendTable took 9.13887966064 seconds to add 3 fields
 - NumPyExtendTable:AddField ratio = 4.469157668
>>> 
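
To make the crossover easier to see, the timings reported in the listings above can be tabulated and the ratios recomputed. The sketch below only re-uses the figures already printed; the `timings` dictionary and the `slower_cases` helper are hypothetical names introduced for illustration.

```python
# (fields, polygons): (AddField seconds, NumPy + ExtendTable seconds),
# copied from the output listings above.
timings = {
    (3, 100): (1.828699526, 0.30685530312),
    (3, 10000): (3.06924217256, 0.202566890045),
    (3, 1000000): (2.04487743319, 9.13887966064),
    (10, 100): (7.70771034263, 0.248153882235),
    (10, 10000): (6.89551317455, 0.665253910326),
    (10, 1000000): (30.2332401168, 33.6147804976),
    (30, 100): (20.4502870049, 0.726164488171),
    (30, 10000): (23.9535915244, 1.59976486159),
    (30, 1000000): (129.248610019, 111.991675127),
    (100, 100): (69.8152053888, 2.02338346148),
    (100, 10000): (72.8902866096, 4.54466470351),
    (100, 1000000): (418.9092904, 330.8793667),
}

def slower_cases(results):
    """Return the (fields, polygons) keys where ExtendTable took longer than AddField."""
    return sorted(key for key, (add_field, extend) in results.items()
                  if extend / add_field > 1.0)

for key in sorted(timings):
    add_field, extend = timings[key]
    print("{0} fields, {1} polygons: ratio = {2:.3f}".format(
        key[0], key[1], extend / add_field))

slower = slower_cases(timings)
print(slower)
```

On these figures, the only runs where NumPy and ExtendTable is slower than AddField are the 3-field and 10-field cases at 1,000,000 polygons.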
